Adding a prefix to value derived from sql - sql

I have some JSON data that, due to a quirk of the system it is extracted from, posts two values for the same item. E.g.:
[
{
"data":{
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA004":"0001-01-01T00:00:00+00:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA013":"0001-01-01T00:00:00+00:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbMRA013":"2020-09-21T18:15:36.4919022+01:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA010":"2020-09-21T18:12:35.4119042+01:00",
"ecfa663b-3dd2-4aef-b25c-e43dd6b82enbMRA004":"2020-09-21T00:00:00+01:00"
},
"columns":[
{
"name":"Assessment One",
"keySegment":"RA004",
"mandatory":"true"
},
{
"name":"Assessment Two",
"keySegment":"RA013",
"mandatory":"true"
},
{
"name":"Assessment Three",
"keySegment":"RA010",
"mandatory":"false"
}
]
}
]
RA004 and RA013 are correct, but they are classed as mandatory assessments by the software system and so an "M" prefix is added to the individual person's identifier (the longer number).
My (probably not very efficient) code for extracting this JSON into SQL Server is as follows:
BEGIN TRY
SELECT
LEFT(x.[Key],36) AS "ConnectionID",
y.name AS "Assessment", left(x.[Value],19) as "ReviewDate"
INTO [AssessmentToolsStatus]
FROM OPENJSON(@JSON, '$[0].data') AS x
CROSS APPLY
OPENJSON(@JSON, '$[0].columns')
WITH
(
name nvarchar(50) '$.name',
keySegment nvarchar(10) '$.keySegment',
mandatory nvarchar(10) '$.mandatory'
) y
WHERE REPLACE(x.[Key], Left(x.[Key],36),'') = y.keySegment
However, I'm unsure how to account for the "MRA"s. I want something like "if keySegment in ('RA004', 'RA013') then keySegment = 'M' + keySegment", but I don't quite know how to do that in SQL. I don't need the RA004 and RA013 entries because their date stamps are meaningless.
Can anyone help?
Thanks

If I understand the question correctly, a possible solution is the statement below (I assume that RA004 and RA013 are the items from the $.columns JSON array with $.mandatory equal to "true"):
SELECT
LEFT(x.[Key], 36) AS "ConnectionID",
y.name AS "Assessment",
LEFT(x.[Value], 19) AS "ReviewDate"
-- INTO [AssessmentToolsStatus]
FROM OPENJSON(@JSON, '$[0].data') AS x
JOIN OPENJSON(@JSON, '$[0].columns') WITH (
name nvarchar(50) '$.name',
keySegment nvarchar(10) '$.keySegment',
mandatory nvarchar(10) '$.mandatory'
) y ON
STUFF(x.[key], 1, 36, '') =
CONCAT(CASE WHEN y.mandatory = 'true' THEN 'M' ELSE '' END, y.keySegment)
Result:
ConnectionID                          Assessment        ReviewDate
ecfa663b-3dd2-4aef-b25c-e43dd6b82enb  Assessment One    2020-09-21T00:00:00
ecfa663b-3dd2-4aef-b25c-e43dd6b82enb  Assessment Two    2020-09-21T18:15:36
ecfa663b-3dd2-4aef-b25c-e43dd6b82enb  Assessment Three  2020-09-21T18:12:35
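To try the join condition in isolation, here is a minimal self-contained sketch. It assumes SQL Server 2016 or later (OPENJSON needs compatibility level 130) and uses a cut-down copy of the sample document. It writes the prefix as 'M' + keySegment inside a CASE, which is the form the question describes; the KeySuffix column is added only to make the match visible.

```sql
-- Cut-down copy of the sample document from the question.
DECLARE @JSON nvarchar(max) = N'[
  {
    "data": {
      "ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA004": "0001-01-01T00:00:00+00:00",
      "ecfa663b-3dd2-4aef-b25c-e43dd6b82enbMRA004": "2020-09-21T00:00:00+01:00",
      "ecfa663b-3dd2-4aef-b25c-e43dd6b82enbRA010": "2020-09-21T18:12:35.4119042+01:00"
    },
    "columns": [
      { "name": "Assessment One",   "keySegment": "RA004", "mandatory": "true"  },
      { "name": "Assessment Three", "keySegment": "RA010", "mandatory": "false" }
    ]
  }
]';

-- Strip the 36-character identifier from each data key and compare the rest
-- with the key segment, prefixed with 'M' only for mandatory columns.
SELECT
    STUFF(x.[key], 1, 36, '') AS KeySuffix,
    y.name                    AS Assessment,
    LEFT(x.[value], 19)       AS ReviewDate
FROM OPENJSON(@JSON, '$[0].data') AS x
JOIN OPENJSON(@JSON, '$[0].columns') WITH (
    name       nvarchar(50) '$.name',
    keySegment nvarchar(10) '$.keySegment',
    mandatory  nvarchar(10) '$.mandatory'
) AS y
    ON STUFF(x.[key], 1, 36, '') =
       CASE WHEN y.mandatory = 'true' THEN N'M' + y.keySegment
            ELSE y.keySegment
       END;
```

With this input the unprefixed RA004 entry finds no match and drops out, so only the MRA004 and RA010 rows are returned.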

Related

Add data type to JSON values query

I am building the following view on SQL Server. The data is extracted from a SOAP API via Data Factory and stored in a SQL table. I union two versions of the same query because the API returns the output either as objects or as arrays.
The query works; however, when I try to order or filter the view, I get this error:
JSON text is not properly formatted. Unexpected character '.' is found at position 12.
/*Object - invoiceDetails*/
SELECT
XML.[CustomerID],
XML.[SiteId],
XML.[Date],
JSON_VALUE(m.[value], '$.accountNbr') AS [AccountNumber],
JSON_VALUE(m.[value], '$.actualUsage') AS [ActualUsage]
FROM stage.Bill XML
CROSS APPLY openjson(XML.xmldata) AS n
CROSS APPLY openjson(n.value, '$.invoiceDetails') AS m
WHERE XML.XmlData IS NOT NULL
AND ISJSON (XML.xmldata) > 0
AND n.type = 5
AND m.type = 5
UNION ALL
/*Array - invoiceDetails*/
SELECT
XML.[CustomerID],
XML.[SiteId],
XML.[Date],
JSON_VALUE(o.[value], '$.accountNbr') AS [AccountNumber],
JSON_VALUE(o.[value], '$.actualUsage') AS [ActualUsage]
FROM stage.Bill XML
CROSS APPLY openjson(XML.xmldata) AS n
CROSS APPLY openjson(n.value) AS m
CROSS APPLY openjson(m.value, '$.invoiceDetails') AS o
WHERE XML.XmlData IS NOT NULL
AND ISJSON (XML.xmldata) > 0
AND n.type = 4
I did a small exercise using the WITH clause to specify data types for the values and noticed that I was then able to order and filter the view. So I believe the fix is to add data types to this query. My problem is that I don't know how to add a WITH clause to my query to make it work.
Any advice?
My apologies, you might find the XML naming and prefixes confusing. In my first tests I expected to receive XML data from Data Factory, so my table and columns include XML prefixes. I later noticed that Data Factory delivers JSON, so I started querying the data as JSON, but I have not changed the table name or prefixes.
Thank you for your comments to enrich this question. I took the example below from a YouTube channel (not sure if I am allowed to mention the channel's name: https://www.youtube.com/watch?v=yl9jKGgASTY&t=474s). I believe it is a similar approach. Could you please help me add the WITH clause in order to specify data types?
DECLARE @json NVARCHAR(MAX)
SET @json =
N'{
"OrderHeader": [
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail": [
{
"ProductID": 2000,
"UnitPrice": 350
},
{
"ProductID": 5000,
"UnitPrice": 800
},
{
"ProductID": 9000,
"UnitPrice": 200
}
]
}
]
}'
SELECT
JSON_VALUE(a.value, '$.OrderID') AS OrderID,
JSON_VALUE(a.value, '$.CustomerID') AS CustomerID,
JSON_VALUE(a.value, '$.ProductID') AS ProductID,
JSON_VALUE(a.value, '$.UnitPrice') AS UnitPrice
FROM OPENJSON(@json, '$.OrderHeader') AS a
CROSS APPLY OPENJSON(a.value, 'OrderDetail') AS b
You have a couple of typos:
The second OPENJSON should have its path starting with $.
The third and fourth SELECT columns should use b.value, not a.value.
DECLARE @json NVARCHAR(MAX)
SET @json =
N'{
"OrderHeader": [
{
"OrderID": 100,
"CustomerID": 2000,
"OrderDetail": [
{
"ProductID": 2000,
"UnitPrice": 350
},
{
"ProductID": 5000,
"UnitPrice": 800
},
{
"ProductID": 9000,
"UnitPrice": 200
}
]
}
]
}'
SELECT
JSON_VALUE(a.value, '$.OrderID') AS OrderID,
JSON_VALUE(a.value, '$.CustomerID') AS CustomerID,
JSON_VALUE(b.value, '$.ProductID') AS ProductID,
JSON_VALUE(b.value, '$.UnitPrice') AS UnitPrice
FROM OPENJSON(@json, '$.OrderHeader') AS a
CROSS APPLY OPENJSON(a.value, '$.OrderDetail') AS b;
An alternative syntax is to use OPENJSON on each object with an explicit schema
SELECT
oh.OrderID,
oh.CustomerID,
od.ProductID,
od.UnitPrice
FROM OPENJSON(@json, '$.OrderHeader')
WITH (
OrderID int,
CustomerID int,
OrderDetail nvarchar(max) AS JSON
) AS oh
CROSS APPLY OPENJSON(oh.OrderDetail)
WITH (
ProductID int,
UnitPrice decimal(18,9)
) AS od;
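Because the explicit schema gives od.UnitPrice a numeric type, it can be filtered and sorted as a number rather than as text. A short sketch against a trimmed copy of the sample document; the threshold of 300 is an arbitrary value chosen for illustration:

```sql
DECLARE @json nvarchar(max) = N'{
  "OrderHeader": [
    {
      "OrderID": 100,
      "CustomerID": 2000,
      "OrderDetail": [
        { "ProductID": 2000, "UnitPrice": 350 },
        { "ProductID": 5000, "UnitPrice": 800 },
        { "ProductID": 9000, "UnitPrice": 200 }
      ]
    }
  ]
}';

SELECT oh.OrderID, od.ProductID, od.UnitPrice
FROM OPENJSON(@json, '$.OrderHeader')
    WITH (
        OrderID     int,
        OrderDetail nvarchar(max) AS JSON   -- keep the nested array as JSON text
    ) AS oh
CROSS APPLY OPENJSON(oh.OrderDetail)
    WITH (
        ProductID int,
        UnitPrice decimal(18, 9)
    ) AS od
WHERE od.UnitPrice > 300        -- numeric comparison, not string comparison
ORDER BY od.UnitPrice DESC;     -- ProductID 5000 first, then 2000
```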

SQL JSON Query -retrieve data within a JSON array

I have the following SQL "Product" table structure:
int Id
nvarchar(max) Details
Details contains a JSON string with the following structure:
{
"Id": "10001",
"Description": "example description",
"Variants": [{
"Title": "ABC / no",
"Price": "10"
}, {
"Title": "ABC / Yes",
"Price": "20"
}, {
"Title": "ABC / Yes",
"Price": "30"
}]
}
I need to write an SQL Query that would look through the table and return all the Variants with a particular title.
The following work:
Get all rows from the table whose Details field contains a specific description
SELECT * FROM Products
WHERE JSON_VALUE(Details, '$.Description') = 'example description'
Get all rows from the table where Details.Variants[0].Title is equal to '{string}'
SELECT * FROM Products
WHERE JSON_VALUE(Details, '$.Variants[0].Title') = 'ABC / no'
Get all Ids from the table where Details.Variants[0].Title is equal to '{string}'
SELECT JSON_VALUE(Details, '$.Id')
FROM Products
WHERE JSON_VALUE(Details, '$.Variants[0].Title') = 'ABC / no'
I need to get all Variants from all rows in the Product table, where the Variant title is equal to '{string}'
There is a similar example in this documentation but I can't get it to work for my particular case.
There is also this stack post
You need to use OPENJSON() with explicit schema (columns definitions) and additional APPLYs to parse the input JSON and get the expected results. Note, that you need to use AS JSON option to specify that the $.Variants part of the stored JSON is a JSON array.
Table:
CREATE TABLE Products (Id int, Details nvarchar(max))
INSERT INTO Products (Id, Details)
VALUES (1, N'{"Id":"10001","Description":"example description","Variants":[{"Title":"ABC / no","Price":"10"},{"Title":"ABC / Yes","Price":"20"},{"Title":"ABC / Yes","Price":"30"}]}')
Statement:
SELECT p.Id, j1.Id, j1.Description, j2.Title, j2.Price
FROM Products p
CROSS APPLY OPENJSON (p.Details, '$') WITH (
Id int '$.Id',
[Description] nvarchar(100) '$.Description',
Variants nvarchar(max) '$.Variants' AS JSON
) j1
CROSS APPLY OPENJSON(j1.Variants) WITH (
Title nvarchar(100) '$.Title',
Price nvarchar(10) '$.Price'
) j2
WHERE
j2.Title = 'ABC / no'
-- or j1.Description = 'example description'
Result:
Id  Id     Description          Title     Price
1   10001  example description  ABC / no  10
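If the variant objects themselves are wanted rather than individual columns, a shorter form is to expand only the $.Variants array with the default schema and filter with JSON_VALUE. This is a sketch against the same Products table; the alias v and the output column name Variant are illustrative.

```sql
-- One row per matching variant; v.[value] holds that variant as JSON text.
SELECT
    p.Id,
    v.[value] AS Variant
FROM Products AS p
CROSS APPLY OPENJSON(p.Details, '$.Variants') AS v
WHERE JSON_VALUE(v.[value], '$.Title') = 'ABC / Yes';
```

For a row holding the sample document from the question, this would return the two 'ABC / Yes' variants.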

I need all value from JSON array

I have a table in which I have JSON data and the field type is NVARCHAR(4000)
[
{"number":1,"booked":0},
{"number":2,"booked":0},
{"number":3,"booked":0},
{"number":4,"booked":1},
{"number":5,"booked":0},
{"number":6,"booked":0},
{"number":7,"booked":0},
{"number":8,"booked":0}
]
I want to query this array field and get output showing that the number booked is 1 and the number not booked is 7.
I have tried the JSON_VALUE() and JSON_QUERY() functions but have not been able to get there.
I also want to find out that number 4 is the one that is booked.
I am using SQL Server 2016.
Hi, if I understand what you're trying to do, this example should answer it:
DECLARE @json NVARCHAR(MAX)
SET @json =
N'[
{"number":1,"booked":0},
{"number":2,"booked":0},
{"number":3,"booked":0},
{"number":4,"booked":1},
{"number":5,"booked":0},
{"number":6,"booked":0},
{"number":7,"booked":0},
{"number":8,"booked":0}
]'
SELECT number, booked
FROM OPENJSON(@json)
WITH (number int 'strict $.number', booked int 'strict $.booked')
WHERE booked = 1
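The question also asks for the totals (one booked, seven not booked). One way to get both in a single row is conditional aggregation over the same OPENJSON rowset. A minimal sketch; the variable name @seats and the column aliases are made up for the example:

```sql
DECLARE @seats nvarchar(max) = N'[
  {"number":1,"booked":0}, {"number":2,"booked":0},
  {"number":3,"booked":0}, {"number":4,"booked":1},
  {"number":5,"booked":0}, {"number":6,"booked":0},
  {"number":7,"booked":0}, {"number":8,"booked":0}
]';

-- Count each state by summing 1 for the rows that match it.
SELECT
    SUM(CASE WHEN booked = 1 THEN 1 ELSE 0 END) AS BookedCount,     -- 1
    SUM(CASE WHEN booked = 0 THEN 1 ELSE 0 END) AS NotBookedCount   -- 7
FROM OPENJSON(@seats)
    WITH (number int '$.number', booked int '$.booked');
```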
In future, please provide some sample data, the expected output, and the query you have tried.
[
{"number":1,"booked":0},
{"number":2,"booked":0},
{"number":3,"booked":0},
{"number":4,"booked":1},
{"number":5,"booked":0},
{"number":6,"booked":0},
{"number":7,"booked":0},
{"number":8,"booked":0}
]
Select query:
-- [table] and jsonInfo are placeholder names for your table and JSON column
SELECT
j.booked,
COUNT(*) AS total
FROM [table] AS t
CROSS APPLY OPENJSON(t.jsonInfo) WITH (booked int '$.booked') AS j
GROUP BY j.booked
ORDER BY j.booked

MS SQL json query/where clause nested array items

I have JSON data that I can query using CROSS APPLY OPENJSON(...), which gets slow once you start adding multiple cross applies or once the JSON document gets too large. So I wanted to add an index on the data I'm trying to filter on, but I can't get the syntax for nested array items to work without using a cross apply. As a result I can't create an index, because you can't use a cross apply when defining one. According to the MS docs I should just be able to do
JSON_query(my_column, '$.parentItem.nestedItemsArray1.nestedItemsArray2')
and get all the values of the nested array items to query on, then improve performance by adding an index, something like this:
ALTER TABLE mytable
ADD vdata AS JSON_query(my_column,
'$.parentItem.nestedItemsArray1.nestedItemsArray2')
CREATE INDEX idx_json_my_column ON mytable(vdata)
but the above $.array.arrayitems syntax doesn't work.
On a side note, I can't help but think in relational terms, where normally in SQL you would index a column of data like so:
col
---
1|
2|
3|
But JSON data seems to get flattened, so when I use JSON_QUERY as per the MS example I get "1,2,3". I assume I want to index an array of values rather than a flattened version, unless the index will return the inner data of the flattened data?
My plug-and-play working example:
declare @mydata table (
ID int NOT NULL,
jsondata varchar(max) NOT NULL
)
INSERT INTO @mydata (id, jsondata)
VALUES (789, '{ "Id": "12345", "FinanceProductResults": [ { "Term": 12, "AnnualMileage": 5000, "Deposits": 0, "ProductResults": [] }, { "Term": 18, "AnnualMileage": 30000, "Deposits": 15000, "ProductResults": [] }, { "Term": 24, "AnnualMileage": 5000, "Deposits": 0, "ProductResults": [ { "Key": "HP", "Payment": 460.28 } ] }, { "Term": 24, "AnnualMileage": 10000, "Deposits": 0, "ProductResults": [ { "Key": "HP", "Payment": 500.32 } ] }]}')
SELECT
j_Id
,JSON_Value (c.value, '$.Term') as Term -- JSON_QUERY returns NULL for a scalar, so JSON_VALUE is used here
,JSON_Value (c.value, '$.AnnualMileage') as AnnualMileage
,JSON_Value (c.value, '$.Deposits') as Deposits
,JSON_Value (p.value, '$.Key') as [Key]
,JSON_Value (p.value, '$.Payment') as Payment
--,c.value
FROM @mydata f
CROSS APPLY OPENJSON(f.jsondata)
WITH (j_Id nvarchar(100) '$.Id')
CROSS APPLY OPENJSON(f.jsondata, '$.FinanceProductResults') AS c
CROSS APPLY OPENJSON(c.value, '$."ProductResults"') AS p
where
ID = 789
AND JSON_Value (p.value, '$.Payment') = '460.28'
I'm using these MS docs to guide me :
How to create an index
How to get data
Update
I was able to improve performance slightly using the "with" method
SELECT
j_Id,
FinanceDetails.Term,
FinanceDetails.AnnualMileage,
FinanceDetails.Deposits,
Payments.Payment
FROM @mydata f
CROSS APPLY OPENJSON(f.jsondata)
WITH (j_Id nvarchar(100) '$.Id')
OUTER APPLY OPENJSON (f.jsondata, '$.FinanceProductResults' )
WITH (
Term INT '$.Term',
AnnualMileage INT '$.AnnualMileage',
Deposits INT '$.Deposits',
ProductResults NVARCHAR(MAX) '$.ProductResults' AS JSON
) AS FinanceDetails
OUTER APPLY OPENJSON(ProductResults, '$')
WITH (
Payment DECIMAL(19, 4) '$.Payment'
) AS Payments
WHERE
Payments.Payment = 460.28
but I would still like to add an index on the sub-array data to help improve performance.
Currently, you cannot index nested properties.
Is full-text search a possible option? You could create a full-text index on the JSON column and add a predicate:
WHERE ....
AND CONTAINS(jsondata, 'NEAR((Payment, 460), 1)')
Since JSON is text, this predicate will filter out all records that don't have something like "Payment" and 460 near to each other (this will identify key:value pairs), and you can apply CROSS APPLY on the reduced set of rows.
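For contrast, a single scalar at a fixed path can be indexed through a computed column, which is the pattern the Microsoft documentation on indexing JSON data describes. The sketch below assumes a permanent table, since a table variable cannot be altered; dbo.FinanceQuotes, vQuoteId, and the index name are hypothetical names chosen for the example.

```sql
-- Hypothetical permanent table holding the same documents as @mydata.
CREATE TABLE dbo.FinanceQuotes (
    ID       int           NOT NULL PRIMARY KEY,
    jsondata nvarchar(max) NOT NULL
);
GO

-- Computed column over one scalar at a fixed path.
-- The CAST keeps the index key small, since JSON_VALUE returns nvarchar(4000).
ALTER TABLE dbo.FinanceQuotes
    ADD vQuoteId AS CAST(JSON_VALUE(jsondata, '$.Id') AS nvarchar(100));
GO

CREATE INDEX IX_FinanceQuotes_vQuoteId
    ON dbo.FinanceQuotes (vQuoteId);
GO

-- A predicate on the computed column can use the index.
SELECT ID
FROM dbo.FinanceQuotes
WHERE vQuoteId = N'12345';
```

This only helps for values that live at a known path. It does not cover "any element of the nested array", which is the case the question is about.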

Creating a new table from grouped substring of existing table

I am having some trouble writing some SQL (for SQL Server 2008).
I have a table of tasks, where each row holds a priority-ordered, comma-delimited list of tasks:
Id = 1, LongTaskName = "a,b,c"
Id = 2, LongTaskName = "a,c"
Id = 3, LongTaskName = "b,c"
Id = 4, LongTaskName = "a"
etc...
I am trying to build a new table that groups them by the first task, along with the id:
GroupName: "a", TaskId: 1
GroupName: "a", TaskId: 2
GroupName: "a", TaskId: 4
GroupName: "b", TaskId: 3
Here is the naive, slow LINQ code:
foreach(var t in Tasks)
{
var gt = new GroupedTasks();
gt.TaskId = t.Id;
var firstWord = t.LongTaskName.Split(',');
if(firstWord.Count() > 0)
{
gt.GroupName = firstWord.First();
}
else
{
gt.GroupName = t.LongTaskName;
}
GroupedTasks.InsertOnSubmit(gt);
}
I wrote a sql function to do the string split:
create function fn_Split(
@String nvarchar (4000),
@Delimiter nvarchar (10)
)
returns nvarchar(4000)
begin
declare @FirstComma int
set @FirstComma = charindex(@Delimiter,@String)
if(@FirstComma = 0)
return @String
return substring(@String, 0, @FirstComma)
end
go
However, I am getting stuck on the real sql to do the work.
I can get the group by alone:
SELECT dbo.fn_Split(LongTaskName, ',')
FROM [dbo].[Tasks]
GROUP BY dbo.fn_Split(LongTaskName, ',')
And I know I need to head down something like this:
DECLARE @RowSet TABLE (GroupName nvarchar(1024), Id nvarchar(5))
insert into @RowSet
select ???
FROM [dbo].Tasks as T
INNER JOIN
(
SELECT dbo.fn_Split(LongTaskName, ',')
FROM [dbo].[Tasks]
GROUP BY dbo.fn_Split(LongTaskName, ',')
) G
ON T.??? = G.???
ORDER BY ???
INSERT INTO dbo.GroupedTasks(GroupName, Id)
select * from @RowSet
But I am not quite grokking how to reference the grouped relationships, and I am confused about having to call the split function multiple times.
Any thoughts?
If you only care about the first item in the list, there's really no need for a function. I would recommend doing it this way. You also don't need the @RowSet table variable for any temporary holding.
INSERT dbo.GroupedTasks(GroupName, Id)
SELECT
LEFT(LongTaskName, COALESCE(NULLIF(CHARINDEX(',', LongTaskName)-1, -1), 1024)),
Id
FROM dbo.Tasks;
It is even easier if the tasks are 1-character long, you can use LEFT(LongTaskName, 1) instead of the ugly SUBSTRING/CHARINDEX mess. But I'm guessing your task names are not one character long (if this is the case, you should include some data that varies a bit so that others don't make assumptions about length).
Now, keep in mind that you'll have to do something like this to keep dbo.GroupedTasks up to date every time a dbo.Tasks row is inserted, updated or deleted. How are you going to keep these two tables in sync?
More to the point, you should consider storing the top priority task separately in the first place, either by using a computed column or separating it out before the insert. Munging data together is something that you do with hash tables and arrays in application code, but it rarely has any positive attributes inside a database. You almost always spend more time and effort extracting the data apart than you ever saved by keeping it together in the first place. This will negate the need for a second table at all.
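As a rough sketch of the computed-column option, the same expression can be attached to dbo.Tasks so that the group name is always derived from the current LongTaskName and no second table has to be kept in sync. The column name GroupName is borrowed from the question; the rest is an assumption about the schema.

```sql
-- Derive the first (top-priority) task from the comma-delimited list.
-- If there is no comma, the whole value (up to 1024 characters) is used.
ALTER TABLE dbo.Tasks
    ADD GroupName AS
        LEFT(LongTaskName,
             COALESCE(NULLIF(CHARINDEX(',', LongTaskName) - 1, -1), 1024));
GO

-- The derived value can then be queried like any other column.
SELECT GroupName, Id AS TaskId
FROM dbo.Tasks
ORDER BY GroupName, Id;
```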
SELECT Id, dbo.fn_Split(LongTaskName, ',') AS GroupName INTO TasksWithGroupInfo FROM dbo.Tasks
Does this answer your question?