Using BOTH scalar values and JSON objects as JSON values - sql

I have a local table variable that I'm trying to populate with JSON key-value pairs. Sometimes the values are themselves JSON strings:
DECLARE @Values TABLE
(
    JsonKey varchar(200),
    JsonValue varchar(max)
)
An example of what this ends up looking like:
+---------+--------------------------------------+
| JsonKey | JsonValue                            |
+---------+--------------------------------------+
| foo     | bar                                  |
| foo     | [{"label":"fooBar","Id":"fooBarId"}] |
+---------+--------------------------------------+
After populating it, I attempt to build it all up into a single JSON string, like so:
DECLARE @Json JSON =
(
    SELECT V.JsonKey as 'name',
           V.JsonValue as 'value'
    FROM @Values V
    for json path
)
The problem with this is that it turns the JSON values into a string, rather than treating them as JSON. This results in those values not being parsed correctly.
An example of what it ends up looking like:
[
    {
        "name": "foo",
        "value": "bar"
    },
    {
        "name": "foo",
        "value": "[{\"label\":\"fooBar\",\"Id\":\"fooBarId\"}]"
    }
]
I am trying to get the JSON for the second value to NOT be escaped or wrapped in double quotes. What I would like to see is this:
[
    {
        "name": "foo",
        "value": "bar"
    },
    {
        "name": "foo",
        "value": [
            {
                "label": "fooBar",
                "Id": "fooBarId"
            }
        ]
    }
]
If that value will ONLY ever be JSON, I can instead use JSON_QUERY() in the JSON build-up, like this:
DECLARE @Json JSON =
(
    SELECT V.JsonKey as 'name',
           JSON_QUERY(V.JsonValue) as 'value'
    FROM @Values V
    for json path
)
Building it up like this gives me the result I want, but it errors when the JsonValue column is not valid JSON. I attempted to put it in a CASE expression, to only use JSON_QUERY() when JsonValue was valid JSON, but since a CASE expression must always return a single type, it turned the value into a string again and I got a repeat of the first example. I have not been able to find an elegant solution to this, and it really feels like there should be one that I'm just missing. Any help will be appreciated.
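For reference, the kind of CASE expression described above would look roughly like this (a sketch of the attempt, not the exact code from the question):
SELECT V.JsonKey as 'name',
       CASE WHEN ISJSON(V.JsonValue) = 1
            THEN JSON_QUERY(V.JsonValue)
            ELSE V.JsonValue
       END as 'value'
FROM @Values V
for json path
-- The CASE collapses both branches into a plain nvarchar value, so FOR JSON no longer
-- treats the JSON branch as JSON and escapes it as a string again.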

One possible approach is to generate a statement with duplicate column names (two columns both aliased as [value]). By default FOR JSON AUTO does not include NULL values in the output, so the result is the expected JSON. Just note that you must not use INCLUDE_NULL_VALUES in the statement, or the final JSON will contain duplicate keys.
Table:
DECLARE @Values TABLE (
    JsonKey varchar(200),
    JsonValue varchar(max)
)
INSERT INTO @Values
    (JsonKey, JsonValue)
VALUES
    ('foo', 'bar'),
    ('foo', '[{"label":"fooBar","Id":"fooBarId"}]')
Statement:
SELECT
    JsonKey AS [name],
    JSON_QUERY(CASE WHEN ISJSON(JsonValue) = 1 THEN JSON_QUERY(JsonValue) END) AS [value],
    CASE WHEN ISJSON(JsonValue) = 0 THEN JsonValue END AS [value]
FROM @Values
FOR JSON AUTO
Result:
[{"name":"foo","value":"bar"},{"name":"foo","value":[{"label":"fooBar","Id":"fooBarId"}]}]

Related

How do I Unnest varchar to json in Athena

I am crawling data from Google BigQuery and staging it into Athena.
One of the columns, crawled as a string, contains JSON:
{
    "key": "Category",
    "value": {
        "string_value": "something"
    }
}
I need to unnest these and flatten them to be able to use them in a query. I require the key and the string value (so in my query it will be where Category = something).
I have tried the following:
WITH dataset AS (
    SELECT cast(json_column as json) as json_column
    from "thedatabase"
    LIMIT 10
)
SELECT
    json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset
which is returning null.
Casting the json_column as json adds \ characters into it:
"[{\"key\":\"something\",\"value\":{\"string_value\":\"app\"}}
If I use replace on the json, it doesn't allow me to, as it's not a varchar object.
So how do I extract the values from the some_column field?
Presto's json_extract_scalar actually supports extracting directly from a varchar (string) value:
-- sample data
WITH dataset(json_column) AS (
    values ('{
        "key": "Category",
        "value": {
            "string_value": "something"
        }
    }')
)
-- query
SELECT
    json_extract_scalar(json_column, '$.value.string_value') AS string_value
FROM dataset;
Output:
string_value
something
Casting to json will encode the data as JSON (in the case of a string you will get a double-encoded one), not parse it. Use json_parse instead (in this particular case it is not needed, but there are cases where you will want to use it):
-- query
SELECT
json_extract_scalar(json_parse(json_column), '$.value.string_value') AS string_value
FROM dataset;
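For comparison, this is roughly what the original CAST-based attempt does against the same sample: the cast double-encodes the already-serialized JSON, the path no longer matches, and the extract comes back NULL (illustrative sketch):
-- returns NULL: cast() wraps the string in another JSON string instead of parsing it
SELECT
    json_extract_scalar(cast(json_column as json), '$.value.string_value') AS string_value
FROM dataset;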

Edit json with array of objects error, it's replacing the array with one object

I have a JSON column like this:
[
    {
        "JoinedTime": "2021-04-13T20:09:40.654Z",
        "LeftTime": "2021-04-13T20:09:53.368Z"
    },
    {
        "JoinedTime": "2021-04-13T20:09:40.654Z",
        "LeftTime": null
    }
]
And I have to update all null 'LeftTime' properties to GETUTCDATE(), i.e. change each null value to the current UTC time.
I've gotten this far:
UPDATE JoinedLeft
SET JsonColumn = JSON_MODIFY(js.[value], '$.LeftTime', FORMAT(GETUTCDATE(), 'yyyy-MM-dd"T"HH:mm:ss"Z"'))
FROM JoinedLeft JL
CROSS APPLY OPENJSON(JL.JsonColumn) AS js
WHERE
JSON_VALUE(js.[value], '$.LeftTime') IS NULL AND
JSON_VALUE(js.[value], '$.JoinedTime') IS NOT NULL
But it replaces the whole column with just the object that I wanted to edit, instead of editing the object and saving the array again.
Can someone help me?
When you parse a JSON array with OPENJSON() and the default schema (without the WITH clause), the result is a table with one row for each item in the parsed JSON array. This explains the unexpected results from the UPDATE statement. I don't think that JSON_MODIFY() supports wildcards as a path parameter, so one possible option is to parse, modify and build the JSON array again:
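To see where the extra rows come from, parsing one of the sample arrays with the default schema looks roughly like this (an illustrative sketch):
SELECT [key], [value], [type]
FROM OPENJSON('[
    {"JoinedTime": "2021-04-13T20:09:40.654Z", "LeftTime": "2021-04-13T20:09:53.368Z"},
    {"JoinedTime": "2021-04-14T21:09:40.654Z", "LeftTime": null}
]')
-- Returns one row per array element: [key] is the element's index ("0", "1", ...),
-- [value] is that element's JSON text, and [type] is 5 (object).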
Table:
SELECT JsonColumn
INTO JoinedLeft
FROM (VALUES
('[
{"JoinedTime": "2021-04-13T20:09:40.654Z","LeftTime": "2021-04-13T20:09:53.368Z"},
{"JoinedTime": "2021-04-14T21:09:40.654Z", "LeftTime": null}
]'),
('[
{"JoinedTime": "2021-05-14T21:09:40.654Z", "LeftTime": null}
]')
) t (JsonColumn)
Statement:
UPDATE JL
SET JL.JsonColumn = V.JsonColumn
FROM JoinedLeft JL
CROSS APPLY (
SELECT
JoinedTime,
COALESCE(LeftTime, FORMAT(GETUTCDATE(), 'yyyy-MM-dd"T"HH:mm:ss"Z"')) AS LeftTime
FROM OPENJSON(JL.JsonColumn, '$') WITH (
JoinedTime varchar(24) '$.JoinedTime',
LeftTime varchar(24) '$.LeftTime'
) AS JsonColumn
FOR JSON PATH
) V (JsonColumn);
Result:
JsonColumn
[{"JoinedTime":"2021-04-13T20:09:40.654Z","LeftTime":"2021-04-13T20:09:53.368Z"},{"JoinedTime":"2021-04-14T21:09:40.654Z","LeftTime":"2021-05-18T12:12:35Z"}]
[{"JoinedTime":"2021-05-14T21:09:40.654Z","LeftTime":"2021-05-18T12:12:35Z"}]

How to read JSON key values as a data column in Snowflake?

I have the below sample JSON:
{
    "Id1": {
        "name": "Item1.jpg",
        "Status": "Approved"
    },
    "Id2": {
        "name": "Item2.jpg",
        "Status": "Approved"
    }
}
and I am trying to get the following output:
_key  name       Status
Id1   Item1.jpg  Approved
Id2   Item2.jpg  Approved
Is there any way I can achieve this in Snowflake using SQL?
You should use Snowflake's VARIANT data type in any column holding JSON data. Let's break this down step by step:
create temporary table FOO(v variant); -- Temp table to hold the JSON. Often you'll see a variant column simply called "V"
-- Insert into the variant column. Parse the JSON because variants don't hold string types. They hold semi-structured types.
insert into FOO select parse_json('{"Id1": {"name": "Item1.jpg", "Status": "Approved"}, "Id2": {"name": "Item2.jpg", "Status": "Approved"}}');
-- See how it looks in its raw state
select * from FOO;
-- Flatten the top-level JSON. The flatten function breaks down the JSON into several usable columns
select * from foo, lateral flatten(input => (foo.v)) ;
-- Now traverse the JSON using the column name and : to get to the property you want. Cast to string using ::string.
-- If you must have exact case on your column names, you need to double quote them.
select KEY as "_key",
VALUE:name::string as "name",
VALUE:Status::string as "Status"
from FOO, lateral flatten(input => (FOO.V)) ;
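Run against the sample row inserted above, that final query should return the layout asked for in the question:
_key  name       Status
Id1   Item1.jpg  Approved
Id2   Item2.jpg  Approved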

Convert a mssql openjson array type value result to a table?

I have a JSON object in my Microsoft (MS) SQL Server query. This JSON object has one value, which is an array of strings.
--this variable holds my JSON object with a value of array type.
declare @json nvarchar(max) = N'{
    "value": [
        "tapiwanashe",
        "robert",
        "emmerson",
        "ruwimbo",
        "takudzwa",
        "munyaradzi"
    ]
}'
My goal is to write a SQL query using the supported MS SQL Server JSON functions that produces a table with one column and six rows of the values in the JSON object value array above.
I have tried to run the JSON_QUERY and the OPENJSON functions. However, both functions return the array of strings as the output. I would like to have a result with one column and six rows.
select JSON_QUERY(@json, '$.value')
select [value] from OPENJSON(@json)
The result I am getting is:
value
---------------
[
"tapiwanashe",
"robert",
"emmerson",
"ruwimbo",
"takudzwa",
"munyaradzi"
]
However, the result I am expecting to get looks like this:
value
-----------
tapiwanashe
robert
emmerson
ruwimbo
takudzwa
munyaradzi
The result must preserve the order in which the values appear in the value array.
Like this:
declare @json nvarchar(max) = N'{
    "value": [
        "tapiwanashe",
        "robert",
        "emmerson",
        "ruwimbo",
        "takudzwa",
        "munyaradzi"
    ]
}'
select value
from openjson(@json,'$.value')
order by [key]
outputs
value
----------
tapiwanashe
robert
emmerson
ruwimbo
takudzwa
munyaradzi
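One small caveat (my own note, not part of the original answer): with the default schema, [key] holds the zero-based array index but is returned as a string, so for arrays with more than ten elements a numeric sort is safer:
select value
from openjson(@json,'$.value')
order by cast([key] as int)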

SELECT JSON_VALUE From table returns null instead of value

JSON stored in a column 'DataJson' in a table:
[{
    "KickOffDate": "1-Jan-2019",
    "TeamSize": "11",
    "ClientEngineer": "Sagar",
    "WaitingPeriod": "16.5"
}]
Query
SELECT JSON_VALUE(DataJson,'$.KickOffDate') AS KickOffDate
, JSON_VALUE(DataJson,'$.ClientEngineer') AS ClientEngineer
FROM [ABC].[Deliver]
Result:
KickOffDate  ClientEngineer
NULL         NULL
Result should be:
KickOffDate  ClientEngineer
1-Jan-2019   Sagar
Your SQL query is wrong.
You have to correct the query as shown below.
SELECT JSON_VALUE(DataJson, '$[0].KickOffDate') AS KickOffDate
     , JSON_VALUE(DataJson, '$[0].ClientEngineer') AS ClientEngineer
FROM [ABC].[Deliver]
The data stored in the table is not a JSON object, it's a JSON array.
So in order to get each value of the JSON object, you need to specify the index of the JSON object within the JSON array.
Otherwise, you can store the data as a JSON object, and then your original query will work normally.
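A minimal sketch of that last suggestion, assuming the row is stored as a single JSON object rather than an array (hypothetical data, not the poster's actual table):
-- DataJson stored as an object, e.g.
-- '{"KickOffDate": "1-Jan-2019", "TeamSize": "11", "ClientEngineer": "Sagar", "WaitingPeriod": "16.5"}'
SELECT JSON_VALUE(DataJson, '$.KickOffDate') AS KickOffDate
     , JSON_VALUE(DataJson, '$.ClientEngineer') AS ClientEngineer
FROM [ABC].[Deliver]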
Your JSON appears to be malformed, at least from the point of view of SQL Server's JSON API. From what I have read, if your JSON data consists of a top level JSON array, then the array needs to have a key name, and also the entire contents should be wrapped in { ... }.
The following setup has been tested and works:
WITH yourTable AS (
SELECT '{ "data" : [{"KickOffDate": "1-Jan-2019", "TeamSize": "11", "ClientEngineer": "Sagar", "WaitingPeriod": "16.5"}] }' AS DataJson
)
SELECT
JSON_VALUE(DataJson, '$.data[0].KickOffDate') AS KickOffDate,
JSON_VALUE(DataJson, '$.data[0].ClientEngineer') AS ClientEngineer
FROM yourTable;
Here is what the input JSON I used looks like:
{
    "data" : [
        {
            "KickOffDate": "1-Jan-2019",
            "TeamSize": "11",
            "ClientEngineer": "Sagar",
            "WaitingPeriod": "16.5"
        }
    ]
}