How to extract characters from a string stored as json data and place them in dynamic number of columns in SQL Server - sql

I have a column of string in SQL Server that stores JSON data with all the braces and colons included.
My problem is to extract all the key and value pairs and store them in separate columns with the key as the column header. What makes this challenging is that every record has different number of key/value pairs.
For example in the image below showing 3 records, the first record has 5 key/value pairs- EndUseCommunityMarket of 2, EndUseProvincial Market of 0, and so on. The second record has 1 key/value pair, and the third record has two key/value pairs.
If I have to show how I want this in excel it would be like:
I have seen some SQL code examples that does something similar but for a fixed number of columns, unlike this one it varies for every record.
Please I need a SQL statement that can achieve this as I am working on thousands of records.
Below is this data copied from sql server:
catch_ext
{"NfdsFadMonitoring":{"EndUseEaten":1}}
{"NfdsFadMonitoring":{"EndUseCommunityMarket":3}}
{"NfdsFadMonitoring":{"SpeciesComment":"","EndUseCommunityMarket":2}}
{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":31}}
{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}
{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":18}}

I expect you don't want to dynamically create a table, instead you probably want to create a property mapping table. Here is a quick overview of the design.
Object table -- this stores the base information about your object
============
ID -- unique id field for every object.
Name
Property types table -- this stores all the property types
====================
Property_Type_ID -- unique type id
Description -- describes property
Object2Property -- stores the values for each property
===============
ObjectID -- the object
Property_Type_ID -- the property type
Value -- the value.
Using a model like this lets your properties be as dynamic as you wish but you don't have to create columns dynamically -- something that is hard and error prone.
using your specific example the tables would look like this
OBJECT
ID NAME
1 WHAOO
2 RED SNAMPPER
3 KAWAKAWA
Property Types
ID DESC
1 EndUseCommunityMarket
2 EndUseProvincialMarket
3 EndUseUrbanMarket
4 EndUseEaten
5 EndUseGivenAway
6 Comment
Map
ObjID TypeID Value
1 1 2
1 2 0
1 3 0
1 4 0
1 5 0
2 2 50
3 3 8
3 5 1

A. ROWS
Dynamic columns are a lot like rows.
You could use OPENJSON (Transact-SQL)
DECLARE #json2 NVARCHAR(4000) = N'{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}';
SELECT [key], value
FROM OPENJSON(#json2,'lax $.NfdsFadMonitoring')
Output
key value
SpeciesComment 10 fish with a total of 18kg
EndUseCommunityMarket 0
EndUseProvincialMarket 0
EndUseUrbanMarket 8
EndUseEaten 1
EndUseGivenAway 1
Your inputs
CREATE TABLE ForEloga (Id int,Json nvarchar(max));
Insert into ForEloga Values
(1,'{"NfdsFadMonitoring":{"EndUseEaten":1}}'),
(2,'{"NfdsFadMonitoring":{"EndUseCommunityMarket":3}}'),
(3,'{"NfdsFadMonitoring":{"SpeciesComment":"","EndUseCommunityMarket":2}}'),
(4,'{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":31}}'),
(5,'{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}'),
(6,'{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":18}}');
SELECT Id, [key], value
FROM ForEloga CROSS APPLY OPENJSON(Json,'lax $.NfdsFadMonitoring')
Output
Id key value
1 EndUseEaten 1
2 EndUseCommunityMarket 3
3 SpeciesComment
3 EndUseCommunityMarket 2
4 SpeciesComment mix reef fis
4 EndUseEaten 31
5 SpeciesComment 10 fish with a total of 18kg
5 EndUseCommunityMarket 0
5 EndUseProvincialMarket 0
5 EndUseUrbanMarket 8
5 EndUseEaten 1
5 EndUseGivenAway 1
6 SpeciesComment mix reef fis
6 EndUseEaten 18
B. COLUMNS: CROSS APPLY WITH WITH
If you know all possible properties then I recommend CROSS APPLY with WITHas shown in Example 3 - Join rows with JSON data stored in table cells using CROSS APPLY in OPENJSON (Transact-SQL).
SELECT store.title, location.street, location.lat, location.long
FROM store
CROSS APPLY OPENJSON(store.jsonCol, 'lax $.location')
WITH (
street varchar(500),
postcode varchar(500) '$.postcode',
lon int '$.geo.longitude',
lat int '$.geo.latitude'
) AS location

Try this:
Table Schema:
CREATE TABLE #JsonValue(sp_name VARCHAR(100),catch_ext VARCHAR(1000))
INSERT INTO #JsonValue VALUES ('WAHOO','{"NfdsFadMonitoring":{"EndUseEaten":1}}')
INSERT INTO #JsonValue VALUES ('RUBY SNAPPER','{"NfdsFadMonitoring":{"EndUseCommunityMarket":3}}')
INSERT INTO #JsonValue VALUES ('KAWAKAWA','{"NfdsFadMonitoring":{"SpeciesComment":"","EndUseCommunityMarket":2}}')
INSERT INTO #JsonValue VALUES ('XXXXXXXX','{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":31}}')
INSERT INTO #JsonValue VALUES ('YYYYYYYY','{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}')
INSERT INTO #JsonValue VALUES ('ZZZZZZZZZZ','{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":18}}')
Query:
SELECT sp_name
,ISNULL(MAX(CASE WHEN [Key]='EndUseCommunityMarket' THEN Value END),'')EndUseCommunityMarket
,ISNULL(MAX(CASE WHEN [Key]='EndUseProvincialMarket' THEN Value END),'')EndUseProvincialMarket
,ISNULL(MAX(CASE WHEN [Key]='EndUseUrbanMarket' THEN Value END),'')EndUseUrbanMarket
,ISNULL(MAX(CASE WHEN [Key]='EndUseEaten' THEN Value END),'')EndUseEaten
,ISNULL(MAX(CASE WHEN [Key]='EndUseGivenAway' THEN Value END),'')EndUseGivenAway
FROM(
SELECT sp_name, [key], value
FROM #JsonValue CROSS APPLY OPENJSON(catch_ext,'$.NfdsFadMonitoring')
)D
GROUP BY sp_name
Output:
sp_name EndUseCommunityMarket EndUseProvincialMarket EndUseUrbanMarket EndUseEaten EndUseGivenAway
------------- --------------------- ---------------------- ----------------- ----------- ---------------
KAWAKAWA 2
RUBY SNAPPER 3
WAHOO 1
XXXXXXXX 31
YYYYYYYY 0 0 8 1 1
ZZZZZZZZZZ 18
Hope this will help you.

Related

Use JSON array in SQL server instead of traditional relational database, each value in 1 row

I have saved the answer values in a table in rows, 1 answer 1 row, 5 rows in this example.
If I migrate it to JSON it will be 2 rows(JSON)
Table
Id
Optionsid
Pid
Column
1
2
1
null
2
1
2
null
3
1
2
null
4
2
2
null
5
3
1
null
I want to calculate how many answers(pid) for each Optionsid with
SELECT COUNT(pid)AS Counted, OptionsId
FROM Answer GROUP BY [Column], OptionsId
Table Results
Counted
Optionsid
2
1
2
2
1
3
I have run thus query and saved it in a new table
select * from Answer for Json Auto
Json Table I added {"answer":} to the Json
id
pid
json
1
1
{"Answer":[{"Id":1,"Optionsid":2,"Pid":1}]}
2
2
{"Answer":[{"Id":2,"Optionsid":1,"Pid":2},{"Id":2,"Optionsid":1,"Pid":2},{"Id":3,"Optionsid":2,"Pid":2},{"Id":4,"Optionsid":3,"Pid":2}]}
I want to get the same result from Json Table as the Table result above, but I can get it to work
This Query only take the first[0] in the array, i want a query who take all values in the array.
Can someone help me with this query?
Select Count(Json_value ([json], '$.Answer[0].Pid')) as Counted,
Json_value ([json], '$.Answer[0].Optionsid') as OptionsId
from [PidJson]
group by Json_value ([json],'$.Answer[0].Column'),Json_value
([json],'$.Answer[0].Optionsid')
Here is a fiddle if you want to see
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=0a2df33717a3917bae699ea3983b70b4
Here is the solution
SELECT Count(JsonData.Pid) as Counted,
JsonData.Optionsid
FROM PidJson AS RelationsTab
CROSS APPLY OPENJSON (RelationsTab.json,
N'$.Answer')
WITH (
Pid VARCHAR(200) N'$.Pid',
Optionsid VARCHAR(200) N'$.Optionsid',
ColumnValue INT N'$.Column'
) AS JsonData
Group by JsonData.ColumnValue, JsonData.Optionsid
Thanks for your time and that you "force" me to clearify my question and I find the solution

How to convert column values to single row value in postgresql

Data
id cust_name
1 walmart_ca
2 ikea_mo
2 ikea_ca
2 ikea_in
1 walmart_in
when i do
select id,cust_name from test where id=2
Query returns below output
id cust_name
2 ikea_mo
2 ikea_ca
2 ikea_in
How can i get or store the result as single column value as shown below
id cust_name
2 {ikea_mo,ikea_ca,ikea_in}
you should use string_agg function and here is an example for it
select string_agg(field_1,',') from db.schema.table;
you should mention the separator in your case its , so I am doing string_agg(field_1,',')

List category/subcategory tree and display its sub-categories in the same row

I have a hierarchical table of Regions and sub-regions, and I need to list a tree of regions and sub-regions (which is easy), but also, I need a column that displays, for each region, all the ids of it's sub regions.
For example:
id name superiorId
-------------------------------
1 RJ NULL
2 Tijuca 1
3 Leblon 1
4 Gavea 2
5 Humaita 2
6 Barra 4
I need the result to be something like:
id name superiorId sub-regions
-----------------------------------------
1 RJ NULL 2,3,4,5,6
2 Tijuca 1 4,5,6
3 Leblon 1 null
4 Gavea 2 4
5 Humaita 2 null
6 Barra 4 null
I have done that by creating a function that retrieves a STUFF() of a region row,
but when I'm selecting all regions from a country, for example, the query becomes really, really slow, since I execute the function to get the region sons for each region.
Does anybody know how to get that in an optimized way?
The function that "retrieves all the ids as a row" is:
I meant that the function returns all the sub-region's ids as a string, separated by a comma.
The function is:
CREATE FUNCTION getSubRegions (#RegionId int)
RETURNS TABLE
AS
RETURN(
select stuff((SELECT CAST( wine_reg.wine_reg_id as varchar)+','
from (select wine_reg_id
, wine_reg_name
, wine_region_superior
from wine_region as t1
where wine_region_superior = #RegionId
or exists
( select *
from wine_region as t2
where wine_reg_id = t1.wine_region_superior
and (
wine_region_superior = #RegionId
)
) ) wine_reg
ORDER BY wine_reg.wine_reg_name ASC for XML path('')),1,0,'')as Sons)
GO
When we used to make these concatenated lists in the database we took a similar approach to what you are doing at first
then when we looked for speed
we made them into CLR functions
http://msdn.microsoft.com/en-US/library/a8s4s5dz(v=VS.90).aspx
and now our database is only responsible for storing and retrieving data
this sort of thing will be in our data layer in the application

Oracle Partitioned Sequence

I'm trying to see if exists something to create a sequence with partition logic.
I need a sequence number that depend on other primary key ex:
id_person sequence id
1 | 1
1 | 2
2 | 1
3 | 1
1 | 3
so the sequence must depend on the id_person partition. Is there something like this on oracle or i must implement it by myself on the application level?
thank you.
Hi have create this PLSQL package one function and procedure:
PROCEDURE INIT_SEQUENCE(NAME varchar2, pkColumnNameList PARTITIONED_SEQUENCE_PK_COLUMN);
FUNCTION GET_NEXT_SEQUENCE_VALUE(NAME varchar2, pkPartitionColValue PARTITIONED_SEQUENCE_COL_VALUE) RETURN NUMBER;
INIT_SEQUENCE - get in input the name to associate at the sequence and a list of column name that are the fixed primary key part that vincolate the sequence Ex:'ID_PERSON'
the work of this procedure is to create the table that will manage the increment of sequence according to pkColumnNameList column.
GET_NEXT_SEQUENCE_VALUE- get the name of sequence to increment and the value of pkColumnNameList primary key and make the next step:
1) Create dynamically the sql to work
2) dbms_lock.allocate_unique(); to lock the table
3) check if is present a record in the table for pk value in input
4) if a record is present update the record with max + 1 in the sequence column
5) if a record is not present insert the new record with the 1 in the sequence column
6) return new id;
i would like to receive comment about this thanks in advance...
Is the actual requirement that the secondary sequence be gap free? If so, you've got a giant serialization/scalability issue.
If you need to present a gap-free sequence for human consumption, you could use an actual sequence (or a timestamp, for that matter) as Nick Pierpont suggests and preserve scalability, you could use analytic functions.
Dataset (t1):
ID_PERSON SEQUENCE_ID
---------- -----------
1 1
2 2
3 3
1 4
1 5
1 6
2 7
3 8
1 9
SQL:
select *
from
(select id_person,
sequence_id as orig_sequence_id,
rank ()
over (partition by id_person
order by sequence_id)
as new_sequence_id
from t1
)
order by id_person, new_sequence_id;
Result:
ID_PERSON ORIG_SEQUENCE_ID NEW_SEQUENCE_ID
---------- ---------------- ---------------
1 1 1
1 4 2
1 5 3
1 6 4
1 9 5
2 2 1
2 7 2
3 3 1
3 8 2
I'm afraid you have to do it like this:
INSERT INTO t
(
id_person,
sequence_id
)
VALUES
(
<your_person_id>,
( SELECT 1 + NVL( MAX( sequence_id ), 0 )
FROM t
WHERE t.id_person = <your_person_id>
)
)
What you are looking for is not a sequence, as the Oracle Documentation claims: "The sequence generator provides a sequential series of numbers".
You are looking for a calculated field depending on another, in this case the primary key. As other suggested you need to add the logic on your code. It means in a procedure or in the insert sentence.

Getting a comma-delimited list of PK's for duplicates of a record in SQL Server 2005?

This is an off-shoot of a previous question I had: A little fuzzy on getting DISTINCT on one column?
This query makes a little more sense, given the data:
SELECT Receipts.ReceiptID, FolderLink.ReceiptFolderID
FROM dbo.tbl_ReceiptFolderLnk AS FolderLink INNER JOIN
dbo.tbl_Receipt AS Receipts ON FolderLink.ReceiptID = Receipts.ReceiptID
With results:
ReceiptID ReceiptFolderID NewColumn (duplicate folder ID list)
-------------------- --------------- ----------
1 3
2 3
3 7
4 <---> 4 8,9
5 4
6 1
3 8
4 <---> 8 4,9
4 <---> 9 4,8
That answer provided me to view distinct(ReceiptID)'s... great. Now, for those ID's, 3 and 4, they exist in multiple ReceiptFolderID's.
Given this NON-unique list of ReceiptID's, I'd like an additional column, of comma-delimited ReceiptFolderLinkID's where the ReceiptID also exists.
So for ReceiptID=4, the new column, say, DuplicateFoldersList, should read, "8,9", etc, and similar with ID=3, or any other duplicates.
So basically, I'd like another column to indicate the ReceiptFolderID's additional occurrences of ReceiptID in other folders.
Thanks!
You can create a function that, given a ReceiptID and the "current" ReceiptFolderID for that row, returns the other ReceiptFolderIDs as a concatenated, comma-delimited list. Example:
CREATE FUNCTION [dbo].[GetOtherReceiptFolderIDs](#receiptID int, #receiptFolderID int)
RETURNS varchar(MAX) AS
BEGIN
DECLARE #returnValue varchar(MAX)
SELECT #returnValue = COALESCE(#returnValue + ', ', '') + COALESCE(CONVERT(varchar(MAX), ReceiptFolderID), '')
FROM tbl_ReceiptFolderLink AS FolderLink
WHERE FolderLink.ReceiptID = #receiptID
AND FolderLink.ReceiptFolderID <> #receiptFolderID
RETURN #returnValue
END
Then, you can run a query that uses this function to obtain your new column:
SELECT Receipts.ReceiptID, ReceiptFolderID, dbo.GetOtherReceiptFolderIDs(Receipts.ReceiptID, ReceiptFolderID) AS NewColumn
FROM tbl_Receipt AS Receipts
INNER JOIN tbl_ReceiptFolderLink AS FolderLinks
ON Receipts.ReceiptID = FolderLinks.ReceiptID
I tested this and it produces the following results (if I got your schema correctly):
ReceiptID ReceiptFolderID NewColumn
6 1 NULL
1 3 NULL
2 3 NULL
4 4 8, 9
5 4 NULL
3 7 8
3 8 7
4 8 4, 9
4 9 4, 8
In Mysql there is group_concat aggregate function, but in T-SQL and oracle you need to use another approach... This site lists multiple approaches for T-SQL, but none are very simple and easy (as mysql is)