SQL OPENJSON Array in Objects

I have a table with the following structure/rows:
ID | OptionName | OptionValue
0 | Gender | Male
1 | Gender | Female
I want to query the database and return the following JSON:
[{
"OptionName":"Gender",
"Values":[
"Male",
"Female"
]
}]
However the result I'm currently getting is this:
[{
"OptionName":"Gender",
"Values":[
{
"OptionValue":"Male"
},
{
"OptionValue":"Female"
}
]
}]
Here is my Query:
SELECT TOP(1) OptionName,
(
JSON_QUERY(
(
SELECT OptionValue
FROM [TestJSON].[dbo].[Options]
WHERE OptionName = 'Gender'
FOR JSON PATH
)
)
) AS [Values]
FROM [TestJSON].[dbo].[Options]
WHERE OptionName = 'Gender'
FOR JSON PATH
What can I do to get the result I need?

Although SQL Server 2022 introduced the JSON_ARRAY() function, it is difficult to use it to build a JSON array with a variable number of items, so you may try a string-based approach (STRING_AGG() requires SQL Server 2017 or later):
SELECT DISTINCT o.OptionName, JSON_QUERY(a.[Values]) AS [Values]
FROM Options o
CROSS APPLY (
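-- STRING_ESCAPE keeps quotes and backslashes in the values from breaking the JSON
SELECT CONCAT('[', STRING_AGG(CONCAT('"', STRING_ESCAPE(OptionValue, 'json'), '"'), ','), ']')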
FROM Options
WHERE OptionName = o.OptionName
) a ([Values])
--WHERE o.OptionName = 'Gender'
FOR JSON PATH
If you know the maximum number of values for each option (5 in the example), combining JSON_ARRAY() with the PIVOT relational operator is another option:
SELECT OptionName, JSON_ARRAY([1], [2], [3], [4], [5]) AS [Values]
FROM (
SELECT OptionName, OptionValue, ROW_NUMBER() OVER (PARTITION BY OptionName ORDER BY ID) AS Rn
FROM Options
) t
PIVOT (MAX(OptionValue) FOR Rn IN ([1], [2], [3], [4], [5])) p
FOR JSON PATH
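As a self-contained way to try the string-based approach, here is a minimal sketch. It assumes SQL Server 2017 or later, uses a hypothetical table variable @Options in place of the real table, and swaps the DISTINCT plus CROSS APPLY for a plain GROUP BY, which is a variation and not the answer's exact query. Ordering the aggregated values by ID is also an assumption about the desired order.

```sql
-- Hypothetical sample data mirroring the table in the question.
DECLARE @Options TABLE (ID int, OptionName nvarchar(50), OptionValue nvarchar(50));
INSERT INTO @Options (ID, OptionName, OptionValue)
VALUES (0, N'Gender', N'Male'),
       (1, N'Gender', N'Female');

-- Build the array text per option, then mark it as JSON with JSON_QUERY
-- so that FOR JSON PATH embeds it as an array instead of an escaped string.
SELECT o.OptionName,
       JSON_QUERY(
           CONCAT('[',
                  STRING_AGG(CONCAT('"', STRING_ESCAPE(o.OptionValue, 'json'), '"'), ',')
                      WITHIN GROUP (ORDER BY o.ID),
                  ']')
       ) AS [Values]
FROM @Options AS o
GROUP BY o.OptionName
FOR JSON PATH;
```

Without the JSON_QUERY wrapper, FOR JSON PATH would treat the concatenated text as an ordinary string and escape its quotes.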

Related

SQL Server : parsing a JSON column with irregular values

I have SQL Server 2016 (v13) installation where I am trying to parse a column with JSON data. The data in the column RequestData is in the following format:
[
{ "Name": "SourceSystem", "Value": "SSValue" },
{ "Name": "SourceSystemId", "Value": "XYZ" }
]
[
{ "Name": "SourceSystemId", "Value": "SSID" },
{ "Name": "SourceSystem", "Value": "SSVALUE2" }
]
What I need to get are the values for the SourceSystem element of the JSON object in each row. And here is my Select statement:
SELECT TOP 2
JSON_VALUE(RequestData, '$[0].Value') AS SourceSystem
FROM
RequestDetail
But, due to the order of the JSON elements in the column's data, the values being returned for the SourceSystem column are not correct.
SSValue, SSID
Please note, I need to be able to parse the JSON elements so that the SourceSystem column will have the correct values, i.e. SSValue and SSVALUE2.
I have also tried JSON_Query using some online examples but no luck so far.
Thank you!
Edit
The question has been modified by someone after I posted, so I am adding this for clarification: each row of data, as given in the question, will have several 'Name' elements, and those Name elements can be SourceSystem or SourceSystemId. The question shows data from two rows of the database table's column, but, as you can see, the SourceSystem and SourceSystemId elements are not in the same order in the first and the second row. I simply need to parse the SourceSystem element per row.
Using OPENJSON, you can get all the data as columns and then query it like any other table:
SELECT
Value
FROM RequestDetail
CROSS APPLY OPENJSON(RequestDetail.RequestData)
WITH (Name nvarchar(20),
Value nvarchar(20))
WHERE Name = 'SourceSystem';
Value
SSValue
SSVALUE2
fiddle
Presumably you need OPENJSON here, not JSON_VALUE:
SELECT *
FROM (VALUES(N'[{"Name":"SourceSystem","Value":"SSValue"},{"Name":"SourceSystemId","Value":"XYZ"}]'),
(N'[{"Name":"SourceSystemId","Value":"SSID"},{"Name":"SourceSystem","Value":"SSVALUE2"}]'))V(YourJSON)
CROSS APPLY OPENJSON(V.YourJSON)
WITH (Value nvarchar(20));
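To make the OPENJSON approach easy to try end to end, the following is a minimal sketch, assuming SQL Server 2016 or later with database compatibility level 130 or higher. The table variable @RequestDetail and its RequestId column are hypothetical stand-ins for the real table; the explicit '$.Name' and '$.Value' paths are optional and only spell out which JSON property feeds each column.

```sql
-- Hypothetical stand-in for the RequestDetail table from the question.
DECLARE @RequestDetail TABLE (RequestId int, RequestData nvarchar(max));
INSERT INTO @RequestDetail (RequestId, RequestData)
VALUES (1, N'[{"Name":"SourceSystem","Value":"SSValue"},{"Name":"SourceSystemId","Value":"XYZ"}]'),
       (2, N'[{"Name":"SourceSystemId","Value":"SSID"},{"Name":"SourceSystem","Value":"SSVALUE2"}]');

-- OPENJSON turns each array element into a row, so the position of
-- the SourceSystem element inside the array no longer matters.
SELECT r.RequestId,
       j.[Value] AS SourceSystem
FROM @RequestDetail AS r
CROSS APPLY OPENJSON(r.RequestData)
     WITH ([Name]  nvarchar(50) '$.Name',
           [Value] nvarchar(50) '$.Value') AS j
WHERE j.[Name] = N'SourceSystem';
```

Filtering on the Name column is what replaces the fixed array index used with JSON_VALUE in the question.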
When you want to use JSON_VALUE, just select the correct (needed) values:
SELECT
JSON_VALUE(RequestData, '$[0].Value') AS SourceSystem
FROM RequestDetail
UNION ALL
SELECT
JSON_VALUE(RequestData, '$[1].Value') AS SourceSystem
FROM RequestDetail
output:
SourceSystem
SSValue
SSID
XYZ
SSVALUE2
When you only need values from "SourceSystem", you can always do:
SELECT SourceSystem
FROM (
SELECT
JSON_VALUE(RequestData, '$[0].Name') AS Name,
JSON_VALUE(RequestData, '$[0].Value') AS SourceSystem
FROM RequestDetail
UNION ALL
SELECT
JSON_VALUE(RequestData, '$[1].Name') AS Name,
JSON_VALUE(RequestData, '$[1].Value') AS SourceSystem
FROM RequestDetail )x
WHERE Name='SourceSystem';
output:
SourceSystem
SSValue
SSVALUE2
see: DBFIDDLE
EDIT:
SELECT
x,
MIN(CASE WHEN Name='SourceSystem' THEN SourceSystem END) as SourceSystem,
MIN(CASE WHEN Name='SourceSystemId' THEN SourceSystem END) as SourceSystemId
FROM (
SELECT
ROW_NUMBER() OVER (ORDER BY RequestData) as x,
JSON_VALUE(RequestData, '$[0].Name') AS Name,
JSON_VALUE(RequestData, '$[0].Value') AS SourceSystem
FROM RequestDetail
UNION ALL
SELECT
ROW_NUMBER() OVER (ORDER BY RequestData) as x,
JSON_VALUE(RequestData, '$[1].Name') AS Name,
JSON_VALUE(RequestData, '$[1].Value') AS SourceSystem
FROM RequestDetail
)x
GROUP BY x
;
This will give:
x | SourceSystem | SourceSystemId
1 | SSValue | XYZ
2 | SSVALUE2 | SSID

SQL Server use column value as column names and convert to json array

Let's say I have the following table ProductValues:
ProductID | Name | Value
1 | Market | A
1 | Customer | B
2 | Market | C
2 | Customer | D
I'm able to group them by their ProductID and get these values as an array with the following code:
SELECT
(
SELECT Name, Value FROM ProductValues
WHERE P.ID = ProductID
FOR JSON PATH
)
FROM #ProductIDs P -- #ProductIDs is a table containing the product IDs that I'd like to retrieve
This returns the following:
(No column name)
[{"Name":"Market","Value":"A"},{"Name":"Customer","Value":"B"}]
[{"Name":"Market","Value":"C"},{"Name":"Customer","Value":"D"}]
I would like to dynamically create key value pairs using Pivot. I want to achieve the following:
(No column name)
[{"Market":"A"},{"Customer":"B"}]
[{"Market":"C"},{"Customer":"D"}]
Looking at another answer, I tried the following, but this doesn't set the keys dynamically and won't execute (states that "Value" and "TechName" in the Pivot are undefined):
SELECT(
SELECT Market, Customer
FOR JSON PATH
)
FROM(
SELECT(
SELECT Name, Value FROM ProductValues
WHERE ProductID = P.ID
)
FROM #ProductIDs P
) t
PIVOT(
MAX(Value) -- <--- "Value" undefined
FOR Name IN ( -- <--- "Name" undefined
Market, Customer
)
) AS pvt
GROUP BY
Market, Customer
You can pivot with conditional aggregation, then convert to JSON:
select (
select
max(case when name = 'Market' then value end) as market,
max(case when name = 'Customer' then value end) as customer
from productvalues pv
where pv.productid = p.productid
for json path
) as js
from #ProductIDs p
Here is a demo on DB Fiddle.
SQL Server is declarative by design. If you are looking for dynamic columns, you will need DYNAMIC SQL.
Example
Declare @sql nvarchar(max) = stuff( (Select Distinct ','+QUOTENAME(Name) From ProductValues FOR XML PATH('')),1,1,'')
SET @sql = 'Select B.*
From (
SELECT '+@sql+'
FROM ProductValues
PIVOT (max([Value]) FOR [Name] IN ('+@sql+')) AS pvt
) A
Cross Apply ( (Select A.* for json path ) ) B (JSONData)'
exec(@sql)
Returns
JSONData
[{"Customer":"B","Market":"A"}]
[{"Customer":"D","Market":"C"}]
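If the exact shape from the question is needed (one single-key object per Name/Value pair), a string-based alternative avoids both PIVOT and dynamic SQL. The following is a minimal sketch under stated assumptions: SQL Server 2017 or later for STRING_AGG, and a hypothetical table variable @ProductValues standing in for the real table. It is a different technique from the two answers above, not a restatement of them.

```sql
-- Hypothetical sample data mirroring the ProductValues table in the question.
DECLARE @ProductValues TABLE (ProductID int, [Name] nvarchar(50), [Value] nvarchar(50));
INSERT INTO @ProductValues (ProductID, [Name], [Value])
VALUES (1, N'Market', N'A'), (1, N'Customer', N'B'),
       (2, N'Market', N'C'), (2, N'Customer', N'D');

-- Each row becomes a one-key object {"<Name>":"<Value>"}; STRING_AGG joins
-- the objects per product and CONCAT wraps them in array brackets.
SELECT pv.ProductID,
       CONCAT('[',
              STRING_AGG(CONCAT('{"', STRING_ESCAPE(pv.[Name], 'json'), '":"',
                                STRING_ESCAPE(pv.[Value], 'json'), '"}'), ','),
              ']') AS JSONData
FROM @ProductValues AS pv
GROUP BY pv.ProductID;
```

The order of the objects inside each array is not guaranteed here, because the sample data has no column to sort by; adding WITHIN GROUP (ORDER BY ...) on a suitable column would fix the order.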

Parse JSON array inside an object in T-SQL

I have a json array like this:
{
"TagsDictionary": [
{
"key" : "property1",
"value" : "property1Value"
},
{
"key" : "property2",
"value" : "property2Value"
}
]
}
I need to query it in T-SQL. I want this result:
property1 | property2
property1Value | property2Value
I read the docs, but I can't achieve this result with my structure. Unfortunately the JSON is stored in the database and I can't change it, because other modules depend on it.
I'm using SQL Server 2016.
Not 100% sure this is what you are looking for
Example
Declare @YourTable table (ID int,JSON varchar(max))
Insert Into @YourTable values
(1,'{"TagsDictionary": [{"key" : "property1","value" : "property1Value"},{"key" : "property2","value" : "property2Value"}]}')
;with cte as (
Select A.ID
,RN = row_number() over (partition by id,Indx order by B.[Key])
,CN = Indx+1
,B.[value]
From @YourTable A
Cross Apply (
Select Indx = B.[key]
,C.[key]
,C.[value]
From OpenJSON(A.JSON) A
Cross Apply OpenJSON(A.Value) B
Cross Apply OpenJSON(B.Value) C
) B
)
Select *
From cte
Pivot ( max(value) for CN in ([1],[2]) ) pvt
Returns
ID | RN | 1 | 2
1 | 1 | property1 | property2
1 | 2 | property1Value | property2Value
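When the property names are known in advance, the single-row result from the question can also be produced by pointing OPENJSON at the nested array and pivoting with conditional aggregation. This is a minimal sketch, not the answer's method; it assumes SQL Server 2016 or later (compatibility level 130 or higher), and the variable @json is a hypothetical stand-in for the stored column value.

```sql
-- Hypothetical variable holding the JSON document from the question.
DECLARE @json nvarchar(max) = N'{"TagsDictionary":[
    {"key":"property1","value":"property1Value"},
    {"key":"property2","value":"property2Value"}]}';

-- The second argument makes OPENJSON iterate the TagsDictionary array;
-- the WITH clause maps each element's "key" and "value" to columns.
SELECT MAX(CASE WHEN t.[key] = 'property1' THEN t.[value] END) AS property1,
       MAX(CASE WHEN t.[key] = 'property2' THEN t.[value] END) AS property2
FROM OPENJSON(@json, '$.TagsDictionary')
     WITH ([key]   nvarchar(100) '$.key',
           [value] nvarchar(100) '$.value') AS t;
```

For a table column instead of a variable, the same OPENJSON call can be used in a CROSS APPLY with a GROUP BY on the table's key.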

Why is Snowflake changing the order of JSON values when flattening them into a list?

I have JSON objects stored in the table and I am trying to write a query to get the first element from that JSON.
Replication Script
create table staging.par.test_json (id int, val varchar(2000));
insert into staging.par.test_json values (1, '{"list":[{"element":"Plumber"},{"element":"Craft"},{"element":"Plumbing"},{"element":"Electrics"},{"element":"Electrical"},{"element":"Tradesperson"},{"element":"Home services"},{"element":"Housekeepings"},{"element":"Electrical Goods"}]}');
insert into staging.par.test_json values (2,'
{
"list": [
{
"element": "Wholesale jeweler"
},
{
"element": "Fashion"
},
{
"element": "Industry"
},
{
"element": "Jewelry store"
},
{
"element": "Business service"
},
{
"element": "Corporate office"
}
]
}');
with cte_get_cats AS
(
select id,
val as category_list
from staging.par.test_json
),
cats_parse AS
(
select id,
parse_json(category_list) as c
from cte_get_cats
),
distinct_cats as
(
select id,
INDEX,
UPPER(cast(value:element AS varchar)) As c
from
cats_parse,
LATERAL flatten(INPUT => c:"list")
order by 1,2
) ,
cat_array AS
(
SELECT
id,
array_agg(DISTINCT c) AS sds_categories
FROM
distinct_cats
GROUP BY 1
),
sds_cats AS
(
select id,
cast(sds_categories[0] AS varchar) as sds_primary_category
from cat_array
)
select * from sds_cats;
Values: Categories
{"list":[{"element":"Plumber"},{"element":"Craft"},{"element":"Plumbing"},{"element":"Electrics"},{"element":"Electrical"},{"element":"Tradesperson"},{"element":"Home services"},{"element":"Housekeepings"},{"element":"Electrical Goods"}]}
Flattening it to a list gives me
["Plumber","Craft","Plumbing","Electrics","Electrical","Tradesperson","Home services","Housekeepings","Electrical Goods"]
Issue:
The order of this list is not always the same. Snowflake seems to change the ordering; sometimes it reorders the values alphabetically.
How can I make the order stable? I do not want the order to be changed.
The problem is the way you're using ARRAY_AGG:
array_agg(DISTINCT c) AS sds_categories
Specifying it like that gives Snowflake no guidance on how the contents of the array should be arranged. You should not assume that the arrays will be created in the same order as their input records: they might be, but it's not guaranteed. So you probably want to do
array_agg(DISTINCT c) within group (order by index) AS sds_categories
But that won't work, as if you use DISTINCT c, the value of index for each c is unknown. Perhaps you don't need DISTINCT, then this will work
array_agg(c) within group (order by index) AS sds_categories
If you do need DISTINCT, you need to somehow associate an index with a distinct c value. One way is to use a MIN function on index in the input. Here's a full query
with cte_get_cats AS
(
select id,
val as category_list
from staging.par.test_json
),
cats_parse AS
(
select id,
parse_json(category_list) as c
from cte_get_cats
),
distinct_cats as
(
select id,
MIN(INDEX) AS index,
UPPER(cast(value:element AS varchar)) As c
from
cats_parse,
LATERAL flatten(INPUT => c:"list")
group by 1,3
) ,
cat_array AS
(
SELECT
id,
array_agg(c) within group (order by index) AS sds_categories
FROM
distinct_cats
GROUP BY 1
),
sds_cats AS
(
select id,
cast(sds_categories[0] AS varchar) as sds_primary_category
from cat_array
)
select * from sds_cats;
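If only the first element of the stored list is needed, and neither de-duplication nor the full array matters, the element can be read directly by path, which sidesteps the aggregation order question entirely. This is a minimal sketch in Snowflake SQL under that assumption; it uses GET_PATH on the parsed value and reuses the table and column names from the replication script.

```sql
-- Read the first list element straight from the parsed JSON;
-- no FLATTEN or ARRAY_AGG is involved, so no ordering is at stake.
select id,
       upper(cast(get_path(parse_json(val), 'list[0].element') as varchar)) as sds_primary_category
from staging.par.test_json;
```

This is not equivalent to the DISTINCT-based query when the requirement really is "first distinct value after upper-casing", but for the sample data both should pick the same leading element.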

SQL Pivot not quite right

Maybe I am just not awake enough...I know I have done something like this in the past and when I look at the other answers here I think I am doing the same thing but I am not getting the expected results.
I have this query:
SELECT
tPM.mast_rel as Mast_Rel
, row_Number() over(Partition by tPM.Mast_rel Order by tPM.Mast_rel) as CategoryCount
, S.[RC_TRANS] as [Category]
, SUM(P.[VAL]) as [Value]
FROM #caselist AS tPM
INNER JOIN [TIBURON].[PARSProperty] AS P ON tPM.[MAST_REL] = P.[MAST_REL]
--S.RC_KEY equals combination of P.CAT-P.ART when P.CAT ='Y' otherwise just P.CAT = RC_KEY
LEFT JOIN [TIBURON].[SSCTAB] AS S ON (CASE
WHEN P.[CAT] = 'Y' THEN P.[CAT] + '-' + P.[ART]
ELSE P.[CAT]
END) = S.[RC_KEY] AND S.[RC_TYPE] = 'CP'
WHERE P.[P_INVL] != 'EVD' and S.[RC_TRANS] is not null
GROUP BY tPM.mast_rel, S.[RC_TRANS]
which gives me these results:
I want to pivot them so I get a single Mast_Rel with three columns of the categories
select Mast_Rel,[1], [2], [3]
from
(
SELECT
tPM.mast_rel as Mast_Rel
, row_Number() over(Partition by tPM.Mast_rel Order by tPM.Mast_rel) as CategoryCount
, S.[RC_TRANS] as [Category]
, SUM(P.[VAL]) as [Value]
FROM #caselist AS tPM
INNER JOIN [TIBURON].[PARSProperty] AS P ON tPM.[MAST_REL] = P.[MAST_REL]
--S.RC_KEY equals combination of P.CAT-P.ART when P.CAT ='Y' otherwise just P.CAT = RC_KEY
LEFT JOIN [TIBURON].[SSCTAB] AS S ON (CASE
WHEN P.[CAT] = 'Y' THEN P.[CAT] + '-' + P.[ART]
ELSE P.[CAT]
END) = S.[RC_KEY] AND S.[RC_TYPE] = 'CP'
WHERE P.[P_INVL] != 'EVD' and S.[RC_TRANS] is not null
GROUP BY tPM.mast_rel, S.[RC_TRANS]
)
src
pivot
(
max(Category) for CategoryCount in ([1], [2], [3])
) piv
order by 1;
but instead of getting a single row, I am getting each one on its own row:
Additionally, I need to have a "total" for the Value column also on the pivot. So ultimately I would like a single record that shows:
Can anyone help me tweak my query to get the results I need?
Thank you!
Edit:
Here is a script that will create the data and the current results:
declare @results table (Mast_Rel varchar(100), CategoryCount varchar(10), Category varchar(100) , [Value] varchar(100))
insert into @results (Mast_rel, CategoryCount, Category, [Value])
values
('1602030055590P2404','1','Money','80.00'),
('1602051033480P3481','1','Miscellaneous/other (None of the above)','1000.00'),
('1602051033480P3481','2','Personal accessories (incl serial jewelry)','5000.00'),
('1602051033480P3481','3','Radio, TV, and sound entertainment devices',''),
('1602070005106P2804','1','Miscellaneous/other (None of the above)',''),
('1602080020374P3352','1','Money','128.09'),
('1602080020374P3352','2','Radio, TV, and sound entertainment devices',''),
('1602132349110P5208','1','Money','160.00'),
('1602132349110P5208','2','Radio, TV, and sound entertainment devices',''),
('1602171004296P3848','1','Consumable Goods','21.73'),
('1602201425504P2876','1','Radio, TV, and sound entertainment devices',''),
('16022115223610P3282','1','Consumable Goods','60.00'),
('16022115223610P3282','2','Money','300.00'),
('16022115223610P3282','3','Narcotic Equipment/Paraphernalia','10.00'),
('1602250140284P2804','1','Money','165.00'),
('1602250140284P2804','2','Radio, TV, and sound entertainment devices',''),
('16022916203812P2702','1','Guns/Firearms',''),
('16022916203812P2702','2','Radio, TV, and sound entertainment devices','')
select Mast_Rel,[1], [2], [3]
from
(
SELECT
* from @results
)
src
pivot
(
max(Category) for CategoryCount in ([1], [2], [3])
) piv
order by 1;
You should be able to alter your original query to get the result you want. The problem lies with the GROUP BY and the SUM() in it. Since you are grouping by S.[RC_TRANS] for the SUM() you are returning multiple rows which is altering the final result of your PIVOT.
You could remove the GROUP BY in the inner subquery and use SUM() OVER() instead. Changing your original query to the following should get you the result you want:
select Mast_Rel,[1], [2], [3], [Value]
from
(
SELECT
tPM.mast_rel as Mast_Rel
, row_Number() over(Partition by tPM.Mast_rel Order by tPM.Mast_rel) as CategoryCount
, S.[RC_TRANS] as [Category]
-- change the following line
, SUM(P.[VAL]) OVER(PARTITION BY tPM.Mast_rel) as [Value]
FROM #caselist AS tPM
INNER JOIN [TIBURON].[PARSProperty] AS P ON tPM.[MAST_REL] = P.[MAST_REL]
--S.RC_KEY equals combination of P.CAT-P.ART when P.CAT ='Y' otherwise just P.CAT = RC_KEY
LEFT JOIN [TIBURON].[SSCTAB] AS S ON (CASE
WHEN P.[CAT] = 'Y' THEN P.[CAT] + '-' + P.[ART]
ELSE P.[CAT]
END) = S.[RC_KEY] AND S.[RC_TYPE] = 'CP'
WHERE P.[P_INVL] != 'EVD' and S.[RC_TRANS] is not null
)
src
pivot
(
max(Category) for CategoryCount in ([1], [2], [3])
) piv
order by 1;
By changing from SUM(P.[VAL]) with a GROUP BY to SUM(P.[VAL]) OVER(PARTITION BY tPM.Mast_rel) you're getting the total sum for each tPM.Mast_rel which is what you're trying to return in the final result set. The SUM(P.[VAL]) OVER should calculate the same value for each row in the Mast_Rel which then will not create multiple rows in the final result set.
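The effect of the windowed SUM can be checked in isolation with a minimal sketch. The table variable @t, its column names, and the three sample rows are hypothetical; ordering ROW_NUMBER by Category is also an assumption made only to keep the numbering deterministic.

```sql
-- Hypothetical, simplified data: one case with one category, one with two.
DECLARE @t TABLE (Mast_Rel varchar(20), Category varchar(50), Val decimal(10,2));
INSERT INTO @t (Mast_Rel, Category, Val)
VALUES ('A', 'Money', 80.00),
       ('B', 'Miscellaneous', 1000.00),
       ('B', 'Personal accessories', 5000.00);

-- SUM ... OVER repeats the per-case total on every row of that case, so
-- PIVOT's implicit grouping on (Mast_Rel, Value) yields one row per case.
SELECT Mast_Rel, [1], [2], [3], [Value]
FROM (
    SELECT Mast_Rel,
           ROW_NUMBER() OVER (PARTITION BY Mast_Rel ORDER BY Category) AS CategoryCount,
           Category,
           SUM(Val) OVER (PARTITION BY Mast_Rel) AS [Value]
    FROM @t
) AS src
PIVOT (MAX(Category) FOR CategoryCount IN ([1], [2], [3])) AS piv
ORDER BY Mast_Rel;
```

Replacing the windowed SUM with the raw Val column brings the problem back: case B then has two different Value entries and PIVOT returns two rows for it.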
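I would do conditional aggregation:
select mast_rel,
max(case when categorycount = 1 then category end) as [1],
max(case when categorycount = 2 then category end) as [2],
max(case when categorycount = 3 then category end) as [3],
-- [Value] is varchar in the sample data, so convert before summing; blanks become NULL
sum(try_cast(value as decimal(18,2))) as [Value]
from @results r
group by mast_rel;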
Not sure, as I cannot test it directly, but what happens when you add a max(Mast_Rel) to the pivot clause?
pivot (
max(Mast_Rel), max(Category) for CategoryCount in ([1], [2], [3])
) piv
You can wrap it in another GROUP BY:
SELECT Mast_Rel, MAX([1]) AS [1], MAX([2]) AS [2], MAX([3]) AS [3]
FROM
(
SELECT *
FROM @results
) src
PIVOT
(
MAX(Category) FOR CategoryCount IN ([1], [2], [3])
) piv
GROUP BY Mast_Rel
ORDER BY Mast_Rel;