Add data type to JSON values query - sql

I am building the following view on SQL Server. The data is extracted from a SOAP API via Data Factory and stored in a SQL table. I union two copies of the same query because the API returns invoiceDetails as either an object or an array.
This query works; however, when I try to order or filter the view I get this error:
JSON text is not properly formatted. Unexpected character '.' is found at position 12.
/*Object - invoiceDetails*/
SELECT
XML.[CustomerID],
XML.[SiteId],
XML.[Date],
JSON_VALUE(m.[value], '$.accountNbr') AS [AccountNumber],
JSON_VALUE(m.[value], '$.actualUsage') AS [ActualUsage]
FROM stage.Bill XML
CROSS APPLY openjson(XML.xmldata) AS n
CROSS APPLY openjson(n.value, '$.invoiceDetails') AS m
WHERE XML.XmlData IS NOT NULL
AND ISJSON (XML.xmldata) > 0
AND n.type = 5
AND m.type = 5
UNION ALL
/*Array - invoiceDetails*/
SELECT
XML.[CustomerID],
XML.[SiteId],
XML.[Date],
JSON_VALUE(o.[value], '$.accountNbr') AS [AccountNumber],
JSON_VALUE(o.[value], '$.actualUsage') AS [ActualUsage]
FROM stage.Bill XML
CROSS APPLY openjson(XML.xmldata) AS n
CROSS APPLY openjson(n.value) AS m
CROSS APPLY openjson(m.value, '$.invoiceDetails') AS o
WHERE XML.XmlData IS NOT NULL
AND ISJSON (XML.xmldata) > 0
AND n.type = 4
I just did a small exercise using the WITH clause to specify data types for the values, and noticed that I am then able to order and filter the view. So I believe the fix is to add data types to this query. My problem now is that I don't know how to add the WITH clause to my query to make it work.
Any advice?
My apologies, you might find the XML naming and prefixes confusing. In my first tests I expected to receive XML data from Data Factory, so my table and columns include XML prefixes. I then noticed that Data Factory delivers JSON data and started querying the data as JSON, but I have not changed the table name and prefixes.
Thank you for your comments to enrich this question. I took the example below from a YouTube channel (not sure if I am allowed to mention the channel's name: https://www.youtube.com/watch?v=yl9jKGgASTY&t=474s). I believe it is a similar approach. Could you please help me add the WITH clause in order to specify data types?
DECLARE @json NVARCHAR(MAX)
SET @json =
N'{
"OrderHeader": [
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail": [
{
"ProductID": 2000,
"UnitPrice": 350
},
{
"ProductID": 5000,
"UnitPrice": 800
},
{
"ProductID": 9000,
"UnitPrice": 200
}
]
}
]
}'
SELECT
JSON_VALUE(a.value, '$.OrderID') AS OrderID,
JSON_VALUE(a.value, '$.CustomerID') AS CustomerID,
JSON_VALUE(a.value, '$.ProductID') AS ProductID,
JSON_VALUE(a.value, '$.UnitPrice') AS UnitPrice
FROM OPENJSON(@json, '$.OrderHeader') AS a
CROSS APPLY OPENJSON(a.value, 'OrderDetail') AS b

You have a couple of typos.
The second OPENJSON call should have a path starting with $.
The third and fourth SELECT columns should use b.value, not a.value.
DECLARE @json NVARCHAR(MAX)
SET @json =
N'{
"OrderHeader": [
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail": [
{
"ProductID": 2000,
"UnitPrice": 350
},
{
"ProductID": 5000,
"UnitPrice": 800
},
{
"ProductID": 9000,
"UnitPrice": 200
}
]
}
]
}'
SELECT
JSON_VALUE(a.value, '$.OrderID') AS OrderID,
JSON_VALUE(a.value, '$.CustomerID') AS CustomerID,
JSON_VALUE(b.value, '$.ProductID') AS ProductID,
JSON_VALUE(b.value, '$.UnitPrice') AS UnitPrice
FROM OPENJSON(@json, '$.OrderHeader') AS a
CROSS APPLY OPENJSON(a.value, '$.OrderDetail') AS b;
An alternative syntax is to use OPENJSON on each object with an explicit schema, which also gives every column a data type:
SELECT
oh.OrderID,
oh.CustomerID,
od.ProductID,
od.UnitPrice
FROM OPENJSON(@json, '$.OrderHeader')
WITH (
OrderID int,
CustomerID int,
OrderDetail nvarchar(max) AS JSON
) AS oh
CROSS APPLY OPENJSON(oh.OrderDetail)
WITH (
ProductID int,
UnitPrice decimal(18,9)
) AS od;
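Because the WITH clause returns typed columns, filtering and ordering then behave as they would on ordinary table columns. A minimal sketch on the same sample document, assuming you want detail rows with a price of at least 300 (the threshold and the decimal(18, 2) type are only chosen for illustration):

```sql
-- Sketch: the sample document from the question, compacted.
DECLARE @json nvarchar(max) = N'{
  "OrderHeader": [
    { "OrderID": 100, "CustomerID": 2000,
      "OrderDetail": [
        { "ProductID": 2000, "UnitPrice": 350 },
        { "ProductID": 5000, "UnitPrice": 800 },
        { "ProductID": 9000, "UnitPrice": 200 }
      ] }
  ]
}';

SELECT oh.OrderID, od.ProductID, od.UnitPrice
FROM OPENJSON(@json, '$.OrderHeader')
     WITH (
         OrderID     int,
         OrderDetail nvarchar(max) AS JSON   -- keep the nested array as JSON text
     ) AS oh
CROSS APPLY OPENJSON(oh.OrderDetail)
     WITH (
         ProductID int,
         UnitPrice decimal(18, 2)            -- illustrative type
     ) AS od
WHERE od.UnitPrice >= 300                    -- numeric comparison, not a string comparison
ORDER BY od.UnitPrice DESC;
```

With JSON_VALUE alone every value comes back as nvarchar, so the same filter would need an explicit CAST or TRY_CAST on each column.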

Related

Dealing with nested JSON in SQL OPENJSON WITH command?

I have a JSON value that has some structure to it, but I'm struggling to get to the 3rd level. I am using CROSS APPLY OPENJSON to get to the "Lines" data, but I need to get the "Code" out of the TaxCode area... It seems to be its own JSON array maybe?
Any help would be appreciated... This is what I have so far...
DECLARE @JSONText NVarChar(max) = '{
"UID": "845bc256-6027-4a89-8c05-35e4bb8e6aba",
"Number": "00013608",
"Lines": [{
"RowID": 1,
"Total": 20.0,
"TaxCode": "#{UID=f2cc83e5-0f7f-4831-9d88-dbe110e0683a; Code=S15}"
},{
"RowID": 2,
"Total": 55.49,
"TaxCode": "#{UID=a5cc34e5-0fr4-4325-9d67-bdh110e0683a; Code=S17}"
}]
}';
SELECT J.[UID],J.[Number],LI.*
FROM OPENJSON (@JSONText)
WITH (
[UID] nvarchar(512) '$."UID"',
[Number] nvarchar(50) '$."Number"',
[LineItems] NVarChar(max) '$."Lines"' AS JSON
) J
CROSS APPLY OPENJSON (J.[LineItems])
WITH (
[RowID] INT '$."RowID"',
[Total] Decimal(12,2) '$."Total"',
[TaxCode] NVarChar(512) '$."TaxCode"',
[TaxCodeTest] NVarChar(50) '$."TaxCode.Code"'
) LI;
As the TaxCode value being supplied wasn't a "true" JSON object, I had to use some SQL to work with it as a string and get the required data. I ended up creating a function to do it, using inspiration from this question...
A SQL Query to select a string between two known strings
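The function itself isn't shown, so the following is only a sketch of one way to do the same thing inline with CHARINDEX and SUBSTRING. It assumes the code always sits between the literal text "Code=" and the closing "}", as in the sample data; the alias p and the column name TaxCodeCode are made up for the example.

```sql
-- Sketch: extract the value after 'Code=' from the TaxCode string.
DECLARE @JSONText nvarchar(max) = N'{
  "Lines": [
    { "RowID": 1, "TaxCode": "#{UID=f2cc83e5-0f7f-4831-9d88-dbe110e0683a; Code=S15}" },
    { "RowID": 2, "TaxCode": "#{UID=a5cc34e5-0fr4-4325-9d67-bdh110e0683a; Code=S17}" }
  ]
}';

SELECT LI.RowID,
       -- CASE yields NULL when either marker is missing (StartPos is NULL then)
       CASE WHEN p.EndPos > p.StartPos
            THEN SUBSTRING(LI.TaxCode, p.StartPos, p.EndPos - p.StartPos)
       END AS TaxCodeCode
FROM OPENJSON(@JSONText, '$.Lines')
     WITH (
         RowID   int           '$.RowID',
         TaxCode nvarchar(512) '$.TaxCode'
     ) AS LI
CROSS APPLY (
    SELECT NULLIF(CHARINDEX(N'Code=', LI.TaxCode), 0) + 5 AS StartPos, -- 5 = length of 'Code='
           CHARINDEX(N'}', LI.TaxCode)                    AS EndPos
) AS p;
```

For the two sample lines this gives S15 and S17. The path '$."TaxCode.Code"' in the original attempt returns NULL because TaxCode is a plain string, not a nested object.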

Extracting Array elements in Presto w/o using unnest function

I have a requirement around this data where I need to extract array elements but still keep them grouped, which means I cannot use the unnest function. Below is the sample data:
[
{ "emp_id": 8291828, "name": "bruce", },
{ "emp_id": 8291823, "name": "Rolli" }
]
My data is in the same format as above, i.e. array(row(emp_id varchar, name varchar)). What I need is to get rid of the array, so that the data looks like:
{ "emp_id": 8291828, "name": "bruce", },
{ "emp_id": 8291823, "name": "Rolli" }
I would appreciate it if anyone can help me with this.
You could use element_at if you have a sequence table (1, 2, 3, ...).
with numbers as
(
select * from
(
Values
(1),(2),(3)
) as x(i)
)
,emp as
(
select *
from (
values
(ARRAY[cast(ROW(8291828,'bruce') as row(emp_id bigint, name varchar)), cast(row(8291823,'Rolli') as row(emp_id bigint, name varchar))])
) as emp (records)
)
select
element_at(emp.records,i) record
from numbers n
cross join emp
where n.i <= cardinality(emp.records);

Adding a prefix to value derived from sql

I have some JSON data, that due to a quirk on the system it's extracted from, posts two values for the same item. Eg:
[
{
"data":{
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA004":"0001-01-01T00:00:00+00:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA013":"0001-01-01T00:00:00+00:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbMRA013":"2020-09-21T18:15:36.4919022+01:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA010":"2020-09-21T18:12:35.4119042+01:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbMRA004":"2020-09-21T00:00:00+01:00"
},
"columns":[
{
"name":"Assessment One",
"keySegment":"RA004",
"mandatory":"true"
},
{
"name":"Assessment Two",
"keySegment":"RA013",
"mandatory":"true"
},
{
"name":"Assessment Three",
"keySegment":"RA010",
"mandatory":"false"
}
]
}
]
RA004 and RA013 are correct, but they are classed as mandatory assessments by the software system and so an "M" prefix is added to the individual person's identifier (the longer number).
My (probably not very efficient) code for extracting this JSON into SQL Server is as follows:
BEGIN TRY
SELECT
LEFT(x.[Key],36) AS "ConnectionID",
y.name AS "Assessment", left(x.[Value],19) as "ReviewDate"
INTO [AssessmentToolsStatus]
FROM OPENJSON(@JSON, '$[0].data') AS x
CROSS APPLY
OPENJSON(@JSON, '$[0].columns')
WITH
(
name nvarchar(50) '$.name',
keySegment nvarchar(10) '$.keySegment',
mandatory nvarchar(10) '$.mandatory'
) y
WHERE REPLACE(x.[Key], Left(x.[Key],36),'') = y.keySegment
However, I'm unsure how to account for the "MRA"s. I want something like "if keySegment in ('RA004', 'RA013') then keySegment = 'M' + keySegment", but I don't quite know how to do that in SQL. I don't need the RA004 and RA013 entries because their date stamps are meaningless.
Can anyone help?
Thanks
If I understand the question correctly, a possible solution is the statement below (I assume that RA004 and RA013 are the items from the $.columns JSON array with $.mandatory equal to "true"):
SELECT
LEFT(x.[Key], 36) AS "ConnectionID",
y.name AS "Assessment",
LEFT(x.[Value], 19) AS "ReviewDate"
-- INTO [AssessmentToolsStatus]
FROM OPENJSON(@JSON, '$[0].data') AS x
JOIN OPENJSON(@JSON, '$[0].columns') WITH (
name nvarchar(50) '$.name',
keySegment nvarchar(10) '$.keySegment',
mandatory nvarchar(10) '$.mandatory'
) y ON
STUFF(x.[key], 1, 36, '') =
CONCAT(CASE WHEN y.mandatory = 'true' THEN 'M' ELSE '' END, y.keySegment)
Result:
ConnectionID                          Assessment        ReviewDate
ecfa663b-3dd2-4aef-b25c-e43dd6b82enb  Assessment One    2020-09-21T00:00:00
ecfa663b-3dd2-4aef-b25c-e43dd6b82enb  Assessment Two    2020-09-21T18:15:36
ecfa663b-3dd2-4aef-b25c-e43dd6b82enb  Assessment Three  2020-09-21T18:12:35
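To see why the join condition matches, it can help to evaluate both sides for a single key. This is only an illustrative sketch; the variables @Key, @keySegment and @mandatory are hypothetical stand-ins for x.[key], y.keySegment and y.mandatory.

```sql
-- Sketch: both sides of the ON condition for one sample key.
DECLARE @Key        nvarchar(100) = N'ecfa663b-3dd2-4aef-b25c-e43dd6b82enbMRA013';
DECLARE @keySegment nvarchar(10)  = N'RA013';
DECLARE @mandatory  nvarchar(10)  = N'true';

SELECT LEFT(@Key, 36)         AS ConnectionID,  -- first 36 characters: the identifier
       STUFF(@Key, 1, 36, '') AS KeySuffix,     -- the rest of the key: 'MRA013'
       CONCAT(CASE WHEN @mandatory = 'true' THEN 'M' ELSE '' END,
              @keySegment)    AS ExpectedSuffix; -- 'M' + 'RA013' for mandatory items
```

Both KeySuffix and ExpectedSuffix come out as MRA013, so the rows join. The unprefixed key ending in RA013 has no match for a mandatory column and therefore drops out, which is what removes the meaningless 0001-01-01 entries.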

SQL JSON Query -retrieve data within a JSON array

I have the following SQL "Product" table structure:
int Id
nvarchar(max) Details
Details contains a JSON string with the following structure:
{
"Id": "10001",
"Description": "example description",
"Variants": [{
"Title": "ABC / no",
"Price": "10"
}, {
"Title": "ABC / Yes",
"Price": "20",
}, {
"Title": "ABC / Yes",
"Price": "30",
}]
}
I need to write an SQL Query that would look through the table and return all the Variants with a particular title.
The following queries work:
Get all rows from the table whose Details field contains a specific description
SELECT * FROM Products
WHERE JSON_VALUE(Details, '$.Description') = 'example description'
Get all rows from the table where Details.Variants[0].Title is equal to '{string}'
SELECT * FROM Products
WHERE JSON_VALUE(Details, '$.Variants[0].Title') = 'ABC / no'
Get all Ids from the table where Details.Variants[0].Title is equal to '{string}'
SELECT JSON_VALUE(Details, '$.Id')
FROM Products
WHERE JSON_VALUE(Details, '$.Variants[0].Title') = 'ABC / no'
I need to get all Variants from all rows in the Product table, where the Variant title is equal to '{string}'
There is a similar example in this documentation but I can't get it to work for my particular case.
There is also this stack post
You need to use OPENJSON() with an explicit schema (column definitions) and additional APPLY operators to parse the input JSON and get the expected results. Note that you need the AS JSON option to specify that the $.Variants part of the stored JSON is a JSON array.
Table:
CREATE TABLE Products (Id int, Details nvarchar(max))
INSERT INTO Products (Id, Details)
VALUES (1, N'{"Id":"10001","Description":"example description","Variants":[{"Title":"ABC / no","Price":"10"},{"Title":"ABC / Yes","Price":"20"},{"Title":"ABC / Yes","Price":"30"}]}')
Statement:
SELECT p.Id, j1.Id, j1.Description, j2.Title, j2.Price
FROM Products p
CROSS APPLY OPENJSON (p.Details, '$') WITH (
Id int '$.Id',
[Description] nvarchar(100) '$.Description',
Variants nvarchar(max) '$.Variants' AS JSON
) j1
CROSS APPLY OPENJSON(j1.Variants) WITH (
Title nvarchar(100) '$.Title',
Price nvarchar(10) '$.Price'
) j2
WHERE
j2.Title = 'ABC / no'
-- or j1.Description = 'example description'
Result:
Id  Id     Description          Title     Price
1   10001  example description  ABC / no  10

MS SQL json query/where clause nested array items

I have JSON data that I can query using CROSS APPLY OPENJSON(), which gets slow once you start adding multiple cross applies or once the JSON document gets too large. So I wanted to add an index on the data I'm trying to filter on, but I can't get the syntax for nested array items to work without using a cross apply. As a result I can't create an index, since you can't use a cross apply when defining one. According to the MS docs I should just be able to do
JSON_query(my_column, '$.parentItem.nestedItemsArray1.nestedItemsArray2')
and get all the values of the nested array items, to then query on them and improve performance by adding an index, something like this:
ALTER TABLE mytable
ADD vdata AS JSON_query(my_column,
'$.parentItem.nestedItemsArray1.nestedItemsArray2')
CREATE INDEX idx_json_my_column ON mytable(vdata)
but the above $.array.arrayitems syntax doesn't work?
On a side note, I can't help but think in relational terms, where normally in SQL you would index a column of data like so:
col
---
1|
2|
3|
But JSON data seems to get flattened, so when I use JSON_QUERY as per the MS example I get "1,2,3". I assume I want to index an array of values rather than a flattened version, unless the index will return the inner data of the flattened data?
My plug-and-play working example:
declare @mydata table (
ID int NOT NULL,
jsondata varchar(max) NOT NULL
)
INSERT INTO @mydata (id, jsondata)
VALUES (789, '{ "Id": "12345", "FinanceProductResults": [ { "Term": 12, "AnnualMileage": 5000, "Deposits": 0, "ProductResults": [] }, { "Term": 18, "AnnualMileage": 30000, "Deposits": 15000, "ProductResults": [] }, { "Term": 24, "AnnualMileage": 5000, "Deposits": 0, "ProductResults": [ { "Key": "HP", "Payment": 460.28 } ] }, { "Term": 24, "AnnualMileage": 10000, "Deposits": 0, "ProductResults": [ { "Key": "HP", "Payment": 500.32 } ] }]}')
SELECT
j_Id
,JSON_Value (c.value, '$.Term') as Term -- JSON_VALUE, since Term is a scalar; JSON_QUERY returns NULL for scalars
,JSON_Value (c.value, '$.AnnualMileage') as AnnualMileage
,JSON_Value (c.value, '$.Deposits') as Deposits
,JSON_Value (p.value, '$.Key') as [Key]
,JSON_Value (p.value, '$.Payment') as Payment
--,c.value
FROM @mydata f
CROSS APPLY OPENJSON(f.jsondata)
WITH (j_Id nvarchar(100) '$.Id')
CROSS APPLY OPENJSON(f.jsondata, '$.FinanceProductResults') AS c
CROSS APPLY OPENJSON(c.value, '$."ProductResults"') AS p
where
ID = 789
AND JSON_Value (p.value, '$.Payment') = '460.28'
I'm using these MS docs to guide me :
How to create an index
How to get data
Update
I was able to improve performance slightly using the WITH clause:
SELECT
j_Id,
FinanceDetails.Term,
FinanceDetails.AnnualMileage,
FinanceDetails.Deposits,
Payments.Payment
FROM @mydata f
CROSS APPLY OPENJSON(f.jsondata)
WITH (j_Id nvarchar(100) '$.Id')
OUTER APPLY OPENJSON (f.jsondata, '$.FinanceProductResults' )
WITH (
Term INT '$.Term',
AnnualMileage INT '$.AnnualMileage',
Deposits INT '$.Deposits',
ProductResults NVARCHAR(MAX) '$.ProductResults' AS JSON
) AS FinanceDetails
OUTER APPLY OPENJSON(ProductResults, '$')
WITH (
Payment DECIMAL(19, 4) '$.Payment'
) AS Payments
WHERE
Payments.Payment = 460.28
but I would still like to add an index on the sub-array data to help improve performance.
Currently, you cannot index properties nested inside arrays.
Is full-text search a possible option? You might create a full-text index on the JSON column and add a predicate:
WHERE ....
AND CONTAINS(jsondata, 'NEAR((Payment, 460), 1)')
Since JSON is text, this predicate will filter out all records that don't have something like "Payment" and 460 near to each other (this will identify key:value pairs), and you can apply CROSS APPLY on the reduced set of rows.
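For contrast, a property that sits at a fixed scalar path (not inside an array) can be indexed through a computed column, which is the pattern the MS docs describe. A minimal sketch, assuming a permanent table rather than a table variable; the names dbo.FinanceQuotes, vQuoteId and IX_FinanceQuotes_vQuoteId are hypothetical, and the CAST to nvarchar(100) is my own addition to keep the index key short:

```sql
-- Sketch: index a scalar JSON property via a computed column.
-- Table and column names are hypothetical.
CREATE TABLE dbo.FinanceQuotes (
    ID       int           NOT NULL PRIMARY KEY,
    jsondata nvarchar(max) NOT NULL,
    -- JSON_VALUE returns nvarchar(4000); the CAST keeps the index key small.
    -- On an existing table, ALTER TABLE ... ADD vQuoteId AS ... does the same.
    vQuoteId AS CAST(JSON_VALUE(jsondata, '$.Id') AS nvarchar(100))
);

CREATE INDEX IX_FinanceQuotes_vQuoteId
    ON dbo.FinanceQuotes (vQuoteId);

-- Filtering on the computed column lets the optimizer consider the index.
SELECT ID
FROM dbo.FinanceQuotes
WHERE vQuoteId = N'12345';
```

This only helps for single values such as $.Id. It does not cover values like Payment that repeat inside FinanceProductResults, which is why the full-text predicate above is suggested as a pre-filter.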