Edit json with array of objects error, it's replacing the array with one object - sql

I have a JSON column like this
[
{
"JoinedTime": "2021-04-13T20:09:40.654Z",
"LeftTime": "2021-04-13T20:09:53.368Z"
},
{
"JoinedTime": "2021-04-13T20:09:40.654Z",
"LeftTime": null
}
]
And I have to update all null 'LeftTime' properties to GETUTCDATE(), so change that one 'null' value to the current GETUTCDATE().
I've gotten this far
UPDATE JoinedLeft
SET JsonColumn = JSON_MODIFY(js.[value], '$.LeftTime', FORMAT(GETUTCDATE(), 'yyyy-MM-dd"T"HH:mm:ss"Z"'))
FROM JoinedLeft JL
CROSS APPLY OPENJSON(JL.JsonColumn) AS js
WHERE
JSON_VALUE(js.[value], '$.LeftTime') IS NULL AND
JSON_VALUE(js.[value], '$.JoinedTime') IS NOT NULL
But it just replaces the column with just the object that I wanted to edit, instead of editing the object and saving the array again.
Can someone help me?

When you parse a JSON array with OPENJSON() and the default schema (without the WITH clause), the result is a table with one row for each item in the parsed JSON array. This explains the unexpected results from the UPDATE statement: each row holds a single object, and that object is what gets written back to the column. I don't think that JSON_MODIFY() supports wildcards in the path parameter, so one possible option is to parse, modify and build the JSON array again:
Table:
SELECT JsonColumn
INTO JoinedLeft
FROM (VALUES
('[
{"JoinedTime": "2021-04-13T20:09:40.654Z","LeftTime": "2021-04-13T20:09:53.368Z"},
{"JoinedTime": "2021-04-14T21:09:40.654Z", "LeftTime": null}
]'),
('[
{"JoinedTime": "2021-05-14T21:09:40.654Z", "LeftTime": null}
]')
) t (JsonColumn)
Statement:
UPDATE JL
SET JL.JsonColumn = V.JsonColumn
FROM JoinedLeft JL
CROSS APPLY (
SELECT
JoinedTime,
COALESCE(LeftTime, FORMAT(GETUTCDATE(), 'yyyy-MM-dd"T"HH:mm:ss"Z"')) AS LeftTime
FROM OPENJSON(JL.JsonColumn, '$') WITH (
JoinedTime varchar(24) '$.JoinedTime',
LeftTime varchar(24) '$.LeftTime'
) AS JsonColumn
FOR JSON PATH
) V (JsonColumn);
Result:
JsonColumn
[{"JoinedTime":"2021-04-13T20:09:40.654Z","LeftTime":"2021-04-13T20:09:53.368Z"},{"JoinedTime":"2021-04-14T21:09:40.654Z","LeftTime":"2021-05-18T12:12:35Z"}]
[{"JoinedTime":"2021-05-14T21:09:40.654Z","LeftTime":"2021-05-18T12:12:35Z"}]
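To see why the original UPDATE lost the array, it can help to model both approaches outside SQL. The following Python sketch only illustrates the logic and is not T-SQL; the function name `fill_left_time` and the fixed timestamp are assumptions made for the example.

```python
import json

def fill_left_time(json_text, now):
    """Parse the array, fill null LeftTime values, and serialize the whole array again."""
    items = json.loads(json_text)
    for item in items:
        if item.get("LeftTime") is None:
            item["LeftTime"] = now
    return json.dumps(items, separators=(",", ":"))

column = (
    '[{"JoinedTime":"2021-04-13T20:09:40.654Z","LeftTime":"2021-04-13T20:09:53.368Z"},'
    '{"JoinedTime":"2021-04-14T21:09:40.654Z","LeftTime":null}]'
)
now = "2021-05-18T12:12:35Z"  # stand-in for the formatted GETUTCDATE() value

# Roughly what the original UPDATE did: keep only the element that was modified.
broken = [dict(item, LeftTime=now) for item in json.loads(column) if item["LeftTime"] is None][0]

# What the parse/modify/rebuild statement does: keep every element, modified or not.
fixed = fill_left_time(column, now)

print(json.dumps(broken, separators=(",", ":")))
print(fixed)
```

The first printed value is a single object, the second is the full array, which mirrors the difference between the two UPDATE statements.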

Related

Extract complex json with random key field

I am trying to extract the following JSON into its own rows like the table below in Presto query. The issue here is the name of the key/av engine name is different for each row, and I am stuck on how I can extract and iterate on the keys without knowing the value of the key.
The json is a value of a table row
{
"Bkav":
{
"detected": false,
"result": null
},
"Lionic":
{
"detected": true,
"result": "Trojan.Generic.3611249"
},
...
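+----------------+----------------+------------------------+
| AV Engine Name | Detected Virus | Result                 |
+----------------+----------------+------------------------+
| Bkav           | false          | null                   |
| Lionic         | true           | Trojan.Generic.3611249 |
+----------------+----------------+------------------------+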
I have tried to use json_extract following the documentation here https://teradata.github.io/presto/docs/141t/functions/json.html but there is no mention of extraction if we don't know the key :( I am trying to find a solution that works in both presto & hive query, is there a common query that is applicable to both?
You can cast your json to map(varchar, json) and process it with unnest to flatten:
-- sample data
WITH dataset (json_str) AS (
VALUES (
'{"Bkav":{"detected": false,"result": null},"Lionic":{"detected": true,"result": "Trojan.Generic.3611249"}}'
)
)
--query
select k "AV Engine Name", json_extract_scalar(v, '$.detected') "Detected Virus", json_extract_scalar(v, '$.result') "Result"
from (
select cast(json_parse(json_str) as map(varchar, json)) as m
from dataset
)
cross join unnest (map_keys(m), map_values(m)) t(k, v)
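Output:
+----------------+----------------+------------------------+
| AV Engine Name | Detected Virus | Result                 |
+----------------+----------------+------------------------+
| Bkav           | false          |                        |
| Lionic         | true           | Trojan.Generic.3611249 |
+----------------+----------------+------------------------+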
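The idea behind the map cast and unnest can be sketched in Python as iterating over key/value pairs without knowing the keys in advance. This is only a model of the logic, not Presto or Hive code, and the function name `flatten_engines` is an assumption made for the example.

```python
import json

def flatten_engines(json_text):
    """Return one (engine, detected, result) row per top-level key, whatever the key is called."""
    scans = json.loads(json_text)          # comparable to cast(json_parse(...) as map(varchar, json))
    rows = []
    for engine, verdict in scans.items():  # comparable to unnest(map_keys(m), map_values(m))
        rows.append((engine, verdict.get("detected"), verdict.get("result")))
    return rows

sample = '{"Bkav":{"detected": false,"result": null},"Lionic":{"detected": true,"result": "Trojan.Generic.3611249"}}'
for row in flatten_engines(sample):
    print(row)
```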
The Presto query suggested by @Guru works, but for Hive there is no easy way. I had to:
1. Extract the JSON with get_json_object.
2. Strip some characters and brackets with regexp_replace.
3. Convert the result to a map with str_to_map, then repeat once more to get the nested values out.
SELECT
av_engine,
str_to_map(regexp_replace(engine_result, '\\}', ''),',', ':') AS output_map
FROM (
SELECT
str_to_map(regexp_replace(regexp_replace(get_json_object(raw_response, '$.scans'), '\"', ''), '\\{',''),'\\},', ':') AS key_val_map
FROM restricted_antispam.abuse_malware_scanning
) AS S
LATERAL VIEW EXPLODE(key_val_map) temp AS av_engine, engine_result

Query for retrieve matching json Objects as a list

Assume I have a table called MyTable with a JSON-type column called myjson. The column holds a JSON array containing multiple objects, for example:
[
{
"budgetType": "CF",
"financeNumber": 1236547,
"budget": 1000000
},
{
"budgetType": "ENVELOPE",
"financeNumber": 1236888,
"budget": 2000000
}
]
How can I check whether a record has any JSON object inside its JSON array with financeNumber = 1236547?
Something like this:
SELECT
t.*
FROM
"MyTable",
LATERAL json_to_recordset(myjson) AS t ("budgetType" varchar,
"financeNumber" int,
budget varchar)
WHERE
"financeNumber" = 1236547;
Obviously not tested on your data, but it should provide a starting point.
with a as(
SELECT json_array_elements(myjson)->'financeNumber' as col FROM mytable)
select exists(select from a where col::text = '1236547'::text );
https://www.postgresql.org/docs/current/functions-json.html
json_array_elements returns setof json, so you need a cast before comparing.
Check if a row exists: Fastest check if row exists in PostgreSQL
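Both answers come down to the same check: expand the array into elements and ask whether any element matches. A minimal Python sketch of that logic (not PostgreSQL; the function name `has_finance_number` is an assumption made for the example):

```python
import json

def has_finance_number(json_text, wanted):
    """Return True if any object in the JSON array has financeNumber equal to wanted."""
    return any(obj.get("financeNumber") == wanted for obj in json.loads(json_text))

myjson = """
[
  {"budgetType": "CF", "financeNumber": 1236547, "budget": 1000000},
  {"budgetType": "ENVELOPE", "financeNumber": 1236888, "budget": 2000000}
]
"""
print(has_finance_number(myjson, 1236547))
print(has_finance_number(myjson, 42))
```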

Reading JSON string and find the max value as integer

I have a JSON string as follows:
DECLARE @json nvarchar(max)
SET @json = '{"value": [
{
"AEDAT": "20211110"
},
{
"AEDAT": "20211110"
},
{
"AEDAT": "20211110"
},
{
"AEDAT": "20211112"
},
{
"AEDAT": "20211112"
},
{
"AEDAT": "20211112"
}
]}';
Now I want to read this JSON in SQL Server using OPENJSON() and find the MAX value for each AEDAT. For this, I am using the following query:
SELECT MAX(value)
FROM OPENJSON(@json, '$.value')
The above query is returning a row with key value pair as below:
{"AEDAT":"20211112"}
My objective is to get only 20211112 as integer.
How to achieve this?
If you want to get the max value as integer, you need to use OPENJSON() with explicit schema (the WITH clause with columns definitions). This schema depends on the structure of the parsed JSON (in your case it's a JSON array):
SELECT MAX(AEDAT) AS MaxAEDAT
FROM OPENJSON(@json, '$.value') WITH (
AEDAT int '$.AEDAT'
)
If the parsed values are dates, you may try a different statement:
SELECT MAX(TRY_CONVERT(date, AEDAT, 112)) AS MaxAEDAT
FROM OPENJSON(@json, '$.value') WITH (
AEDAT varchar(8) '$.AEDAT'
)
OPENJSON without an explicit schema gives you the value column which, in your example, will contain an object such as {"AEDAT": "20211110"} with type = 5. Use JSON_VALUE on that object:
select max(cast(json_value(j.value, '$.AEDAT') as int))
from openjson(@json, '$.value') as j
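The difference between taking the maximum of the raw objects and the maximum of the extracted values can be modelled in a few lines of Python. This is a sketch of the logic only, not T-SQL, and the function name `max_aedat` is an assumption made for the example.

```python
import json

def max_aedat(json_text):
    """Extract AEDAT from each array element, convert it to int, and return the largest value."""
    values = json.loads(json_text)["value"]
    return max(int(item["AEDAT"]) for item in values)

doc = '{"value": [{"AEDAT": "20211110"}, {"AEDAT": "20211112"}, {"AEDAT": "20211110"}]}'
print(max_aedat(doc))
```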

Using BOTH scalar values and JSON objects as JSON values

I have a local table variable that I'm trying to populate with JSON key-value pairs. Sometimes the values are themselves JSON strings
DECLARE @Values TABLE
(
JsonKey varchar(200),
JsonValue varchar(max)
)
An example of what this ends up looking like:
+---------+--------------------------------------+
| JsonKey | JsonValue |
+---------+--------------------------------------+
| foo | bar |
| foo | [{"label":"fooBar","Id":"fooBarId"}] |
+---------+--------------------------------------+
After populating it, I attempt to build it all up into a single JSON string, like so:
DECLARE @Json JSON =
(
SELECT V.JsonKey as 'name',
V.JsonValue as 'value'
FROM @Values V
for json path
)
The problem with this is that it turns the JSON values into a string, rather than treating them as JSON. This results in those values not being parsed correctly.
An example of what it ends up looking like:
[
{
"name": "foo",
"value": "bar"
},
{
"name": "foo",
"value": "[{\"label\":\"fooBar\",\"Id\":\"fooBarId\"}]"
}
]
I am trying to get the JSON for the second value to NOT be escaped or wrapped in double quotes. What I would like to see is this:
[
{
"name": "foo",
"value": "bar"
},
{
"name": "foo",
"value": [
{
"label": "fooBar",
"Id": "fooBarId"
}
]
}
]
If that value will ONLY ever be JSON, I can instead use JSON_QUERY() in the JSON build-up, like this:
DECLARE @Json JSON =
(
SELECT V.JsonKey as 'name',
JSON_QUERY(V.JsonValue) as 'value'
FROM @Values V
for json path
)
Building it up like this gives me the result I want, but errors when the JsonValue column is not valid JSON. I attempted to put it in a case statement, to only use JSON_QUERY() when JsonValue was valid JSON, but since case statements are required to always output the same type, it turned it into a string again, and I got a repeat of the first example. I have not been able to find an elegant solution to this, and it really feels like there should be one that I'm just missing. Any help will be appreciated
One possible approach is to generate a statement with duplicate column names (both aliased as value). By default FOR JSON AUTO does not include NULL values in the output, so for each row only the non-NULL column is emitted and the result is the expected JSON. Note that you must not use INCLUDE_NULL_VALUES in the statement, or the final JSON will contain duplicate keys.
Table:
DECLARE @Values TABLE (
JsonKey varchar(200),
JsonValue varchar(max)
)
INSERT INTO @Values
(JsonKey, JsonValue)
VALUES
('foo', 'bar'),
('foo', '[{"label":"fooBar","Id":"fooBarId"}]')
Statement:
SELECT
JsonKey AS [name],
JSON_QUERY(CASE WHEN ISJSON(JsonValue) = 1 THEN JSON_QUERY(JsonValue) END) AS [value],
CASE WHEN ISJSON(JsonValue) = 0 THEN JsonValue END AS [value]
FROM @Values
FOR JSON AUTO
FOR JSON AUTO
Result:
[{"name":"foo","value":"bar"},{"name":"foo","value":[{"label":"fooBar","Id":"fooBarId"}]}]
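The intent of the two conditional columns is "embed the value as JSON when it is an object or array, otherwise keep it as a string". A minimal Python sketch of that decision follows; it is a model of the logic rather than T-SQL, the helper name `to_json_value` is an assumption, and it assumes ISJSON() without a type argument accepts only objects and arrays.

```python
import json

def to_json_value(text):
    """Return the parsed value when text is a JSON object or array, else the plain string."""
    try:
        parsed = json.loads(text)
    except ValueError:
        return text
    # Keep scalars such as 123 or true as plain text, mirroring the assumed ISJSON() behaviour.
    return parsed if isinstance(parsed, (dict, list)) else text

rows = [
    ("foo", "bar"),
    ("foo", '[{"label":"fooBar","Id":"fooBarId"}]'),
]
result = json.dumps(
    [{"name": key, "value": to_json_value(value)} for key, value in rows],
    separators=(",", ":"),
)
print(result)
```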

MS SQL json query/where clause nested array items

I have JSON data that I can query using CROSS APPLY OPENJSON(), which gets slow once you add multiple cross applies or once the JSON document gets too large. So I wanted to add an index on the data I'm filtering on, but I can't get the syntax for nested array items to work without using a cross apply, and you can't use a cross apply when creating an index. According to the MS docs I should just be able to do
JSON_query(my_column, '$.parentItem.nestedItemsArray1.nestedItemsArray2')
I should then be able to get all the values of the nested array items to query on, and improve performance by adding an index, something like this
ALTER TABLE mytable
ADD vdata AS JSON_query(my_column,
'$.parentItem.nestedItemsArray1.nestedItemsArray2')
CREATE INDEX idx_json_my_column ON mytable(vdata)
but the above $.array.arrayitems syntax doesn't work?
On a side note, I can't help but think in relational terms, where normally in SQL you would index a column of data like so
col
---
1|
2|
3|
But JSON data seems to get flattened, so when I use JSON_QUERY as per the MS example I get "1,2,3". I assume I want to index an array of values rather than a flattened version, unless the index will return the inner data of the flattened data?
my plug and play working example
declare @mydata table (
ID int NOT NULL,
jsondata varchar(max) NOT NULL
)
INSERT INTO @mydata (id, jsondata)
VALUES (789, '{ "Id": "12345", "FinanceProductResults": [ { "Term": 12, "AnnualMileage": 5000, "Deposits": 0, "ProductResults": [] }, { "Term": 18, "AnnualMileage": 30000, "Deposits": 15000, "ProductResults": [] }, { "Term": 24, "AnnualMileage": 5000, "Deposits": 0, "ProductResults": [ { "Key": "HP", "Payment": 460.28 } ] }, { "Term": 24, "AnnualMileage": 10000, "Deposits": 0, "ProductResults": [ { "Key": "HP", "Payment": 500.32 } ] }]}')
SELECT
j_Id
,JSON_Value (c.value, '$.Term') as Term -- JSON_VALUE, since Term is a scalar (JSON_QUERY returns NULL for scalars)
,JSON_Value (c.value, '$.AnnualMileage') as AnnualMileage
,JSON_Value (c.value, '$.Deposits') as Deposits
,JSON_Value (p.value, '$.Key') as [Key]
,JSON_Value (p.value, '$.Payment') as Payment
--,c.value
FROM @mydata f
CROSS APPLY OPENJSON(f.jsondata)
WITH (j_Id nvarchar(100) '$.Id')
CROSS APPLY OPENJSON(f.jsondata, '$.FinanceProductResults') AS c
CROSS APPLY OPENJSON(c.value, '$."ProductResults"') AS p
where
ID = 789
AND JSON_Value (p.value, '$.Payment') = '460.28'
I'm using these MS docs to guide me :
How to create an index
How to get data
Update
I was able to improve performance slightly using the "with" method
SELECT
j_Id,
FinanceDetails.Term,
FinanceDetails.AnnualMileage,
FinanceDetails.Deposits,
Payments.Payment
FROM @mydata f
CROSS APPLY OPENJSON(f.jsondata)
WITH (j_Id nvarchar(100) '$.Id')
OUTER APPLY OPENJSON (f.jsondata, '$.FinanceProductResults' )
WITH (
Term INT '$.Term',
AnnualMileage INT '$.AnnualMileage',
Deposits INT '$.Deposits',
ProductResults NVARCHAR(MAX) '$.ProductResults' AS JSON
) AS FinanceDetails
OUTER APPLY OPENJSON(ProductResults, '$')
WITH (
Payment DECIMAL(19, 4) '$.Payment'
) AS Payments
WHERE
Payments.Payment = 460.28
but I would still like to add an index on the sub-array data to help improve performance.
Currently, you cannot index nested properties.
Is full-text search a possible option? You could create a full-text index on the JSON column and add a predicate:
WHERE ....
AND CONTAINS(jsondata, 'NEAR((Payment, 460), 1)')
Since JSON is text, this predicate will filter out all records that don't have something like "Payment" and 460 near each other (this identifies key:value pairs), and you can then apply CROSS APPLY to the reduced set of rows.
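The pattern described here is a cheap text pre-filter followed by an exact check on the parsed document. The Python sketch below models only that two-stage idea; the substring test is a crude stand-in for CONTAINS and does not reproduce full-text semantics, and the function name `matching_payments` and the shortened sample documents are assumptions made for the example.

```python
import json

def matching_payments(documents, payment):
    """Coarse text pre-filter followed by an exact check on the parsed JSON."""
    hits = []
    whole_part = str(int(payment))  # e.g. "460" for 460.28
    for doc_id, text in documents:
        # Cheap filter: skip documents that cannot match (stand-in for CONTAINS).
        if '"Payment"' not in text or whole_part not in text:
            continue
        # Exact filter: walk the nested arrays (stand-in for the CROSS APPLY OPENJSON chain).
        for finance in json.loads(text).get("FinanceProductResults", []):
            for product in finance.get("ProductResults", []):
                if product.get("Payment") == payment:
                    hits.append((doc_id, finance["Term"], product["Key"]))
    return hits

docs = [
    (789, '{"Id": "12345", "FinanceProductResults": [{"Term": 24, "ProductResults": [{"Key": "HP", "Payment": 460.28}]}]}'),
    (790, '{"Id": "67890", "FinanceProductResults": [{"Term": 12, "ProductResults": []}]}'),
]
print(matching_payments(docs, 460.28))
```

Only documents that pass the text check are parsed, which is where the saving would come from when most rows do not match.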