Extracting specific value from large string SQL

I've used a combination of CHARINDEX and SUBSTRING but can't get it working.
I get passed a variable in SQL that contains a lot of text but has an email in it. I need to extract the email value.
I have to use SQL 2008.
I'm trying to extract the value between "EmailAddress":" and ",
An example string is here:
{ "Type":test,
"Admin":test,
"User":{
"UserID":"16959191",
"FirstName":"Test",
"Surname":"Testa",
"EmailAddress":"Test.Test@test.com",
"Address":"Test"
}
}

Assuming you can't upgrade to 2016 or higher, you can use a combination of substring and charindex.
I've used a common table expression to make it less cumbersome, but you don't have to.
DECLARE @json varchar(4000) = '{ "Type":test,
"Admin":test,
"User":{
"UserID":"16959191",
"FirstName":"Test",
"Surname":"Testa",
"EmailAddress":"Test.Test@test.com",
"Address":"Test"
}
}';
WITH CTE AS
(
SELECT @json AS Json,
CHARINDEX('"EmailAddress":', @json) + LEN('"EmailAddress":') AS StartIndex
)
SELECT SUBSTRING(Json, StartIndex, CHARINDEX(',', Json, StartIndex) - StartIndex)
FROM CTE
Result: "Test.Test@test.com"
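The result above still carries the surrounding double quotes. A minimal sketch of one way to drop them: include the opening quote in the search key and stop at the next quote instead of the comma. It assumes the key occurs once, the value contains no escaped quote, and a closing quote exists. The names @key, @keyPos and @valueStart are illustrative and not part of the original answer; the JSON is shortened.

```sql
DECLARE @json VARCHAR(4000) = '{ "User":{ "Surname":"Testa", "EmailAddress":"Test.Test@test.com", "Address":"Test" } }';

-- The key includes the opening quote of the value, so the value starts right after it.
DECLARE @key VARCHAR(100) = '"EmailAddress":"';
DECLARE @keyPos INT = CHARINDEX(@key, @json);
DECLARE @valueStart INT = @keyPos + LEN(@key);

-- Returns NULL when the key is not found at all.
-- Assumption: a closing quote follows the value; otherwise SUBSTRING gets a negative length.
SELECT CASE WHEN @keyPos > 0
            THEN SUBSTRING(@json, @valueStart, CHARINDEX('"', @json, @valueStart) - @valueStart)
       END AS EmailAddress;
```

For the sample string this should return Test.Test@test.com without quotes. It works on SQL Server 2008, since it uses only CHARINDEX, SUBSTRING and LEN.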

The first hint: move to v2016 if possible, so you can use the native JSON support. v2008 is absolutely outdated...
The second hint: any string-based approach (and all my approaches below need some string handling too) will suffer from forbidden characters, unexpected blanks, or any other surprise you might find within your data.
Try it like this:
First I create a mockup scenario to simulate your issue
DECLARE @tbl TABLE(ID INT IDENTITY,YourJson NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(N'{ "Type":"test1",
"Admin":"test1",
"User":{
"UserID":"16959191",
"FirstName":"Test1",
"Surname":"Test1a",
"EmailAddress":"Test1.Test1@test.com",
"Address":"Test1"
}
}')
,(N'{ "Type":"test2",
"Admin":"test2",
"User":{
"UserID":"16959191",
"FirstName":"Test2",
"Surname":"Test2a",
"EmailAddress":"Test2.Test2@test.com",
"Address":"Test2"
}
}');
--Starting with v2016 there is JSON support
SELECT JSON_VALUE(t.YourJson, '$.User.EmailAddress')
FROM @tbl t
--String-methods
--use CHARINDEX AND SUBSTRING
--the first border includes the opening quote of the value, so the result carries no quotes
DECLARE @FirstBorder NVARCHAR(MAX)='"EmailAddress":"';
DECLARE @SecondBorder NVARCHAR(MAX)='",';
SELECT t.*
,A.Pos1
,B.Pos2
,SUBSTRING(t.YourJson,A.Pos1,B.Pos2 - A.Pos1) AS ExtractedEMail
FROM @tbl t
OUTER APPLY(SELECT CHARINDEX(@FirstBorder,t.YourJson)+LEN(@FirstBorder)) A(Pos1)
OUTER APPLY(SELECT CHARINDEX(@SecondBorder,t.YourJson,A.Pos1)) B(Pos2);
--use a XML trick
SELECT CAST('<x>' + REPLACE(REPLACE((SELECT t.YourJson AS [*] FOR XML PATH('')),'"EmailAddress":','<mailAddress value='),',',' />') + '</x>' AS XML)
.value('(/x/mailAddress/@value)[1]','nvarchar(max)')
FROM @tbl t
Some explanations:
JSON support will parse the value directly from a JSON path.
For CHARINDEX and SUBSTRING I use APPLY. The advantage is that you can use the computed positions like variables, with no need to repeat the CHARINDEX expressions over and over.
The XML approach transforms your JSON into a rather strange and ugly XML. The only meaningful element is <mailAddress> with an attribute value. We can use the native XML method .value() to retrieve the value you are asking for.
An intermediate XML looks like this:
<x>{ "Type":"test1" />
"Admin":"test1" />
"User":{
"UserID":"16959191" />
"FirstName":"Test1" />
"Surname":"Test1a" />
<mailAddress value="Test1.Test1@test.com" />
"Address":"Test1"
}
}</x>

Update UDF names stored in table to add parameter value

I have thousands of UDF names stored in a table; they are executed dynamically where required. I have added a new parameter, unit, to the function dbo.GetStockPrice(6544,1), so every call now needs one more parameter value. For now the value is 1, but it could be anything. The data should be changed to dbo.GetStockPrice(6544,1,1) in all rows where dbo.GetStockPrice occurs. I am looking for a query that updates all of them at once.
Sample Data
DECLARE @table AS TABLE(id INT, UDF VARCHAR(1000))
INSERT INTO @table VALUES
(7774,'dbo.GetStockPrice(1211,1)*dbo.GetStockPrice(1211,1)'),
(7775,'dbo.GetStockPrice(232,1)'),
(7778,'dbo.GetStockPrice(6456,1)'),
(7780,'dbo.GetStockPrice(34,1)'),
(7784,'dbo.FNACondition(dbo.FNAMargin(1,NULL,0), 0, dbo.GetStockPrice(654,1)+1)'),
(7786,'dbo.GetStockPrice(9876,1)'),
(7906,'dbo.GetStockPrice(5565,1)'),
(7911,'dbo.GetStockPrice(7886,1)'),
(7912,'dbo.GetStockPrice(87,1)'),
(8403,'dbo.PriceValue(479,NULL,NULL)*dbo.GetStockPrice(6544,1)+dbo.FNAMargin(1,NULL,0)')
Expected Output:
7774 dbo.GetStockPrice(1211,1,1)*dbo.GetStockPrice(1211,1,1)
7775 dbo.GetStockPrice(232,1,1)
so on......
I am still trying with REPLACE and SUBSTRING but cannot come up with a solution. The difficulty is that the call has a different length and position in each row.
Seeking help! Thank you in advance :)
Try it with this approach:
DECLARE @table AS TABLE(id INT, UDF VARCHAR(1000))
INSERT INTO @table VALUES
(7774,'dbo.GetStockPrice(1211,1)*dbo.GetStockPrice(1211,1)'),
(7775,'dbo.GetStockPrice(232,1)'),
(7778,'dbo.GetStockPrice(6456,1)'),
(7780,'dbo.GetStockPrice(34,1)'),
(7784,'dbo.FNACondition(dbo.FNAMargin(1,NULL,0), 0, dbo.GetStockPrice(654,1)+1)'),
(7786,'dbo.GetStockPrice(9876,1)'),
(7906,'dbo.GetStockPrice(5565,1)'),
(7911,'dbo.GetStockPrice(7886,1)'),
(7912,'dbo.GetStockPrice(87,1)'),
(7913,'dbo.Blah(87,1)'),
(8403,'dbo.PriceValue(479,NULL,NULL)*dbo.GetStockPrice(6544,1)+dbo.FNAMargin(1,NULL,0)');
--The query
WITH SplitToParts AS
(
SELECT t.*
,A.parted
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PartNr
,B.part.value('text()[1]','nvarchar(max)') AS Part
FROM @table AS t
CROSS APPLY(SELECT CAST('<x>' + REPLACE((SELECT t.UDF AS [*] FOR XML PATH('')),'dbo.GetStockPrice(','</x><x>$$$FoundIt$$$</x><x>') + '</x>' AS XML)) AS A(parted)
CROSS APPLY parted.nodes(N'/x') AS B(part)
WHERE UDF LIKE '%dbo.GetStockPrice(%'
)
,modified AS
(
SELECT *
,CASE WHEN LAG(stp.Part) OVER(PARTITION BY id ORDER BY PartNr)='$$$FoundIt$$$' THEN STUFF(stp.Part,CHARINDEX(')',stp.Part),0,',1') ELSE stp.Part END AS Added
FROM SplitToParts AS stp
)
SELECT t2.*
,(
SELECT REPLACE(m.added,'$$$FoundIt$$$','dbo.GetStockPrice(')
FROM modified AS m
WHERE m.id=t2.id
ORDER BY m.PartNr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)')
FROM @table AS t2
WHERE UDF LIKE '%dbo.GetStockPrice(%';
The result
7774 dbo.GetStockPrice(1211,1,1)*dbo.GetStockPrice(1211,1,1)
7775 dbo.GetStockPrice(232,1,1)
7778 dbo.GetStockPrice(6456,1,1)
7780 dbo.GetStockPrice(34,1,1)
7784 dbo.FNACondition(dbo.FNAMargin(1,NULL,0), 0, dbo.GetStockPrice(654,1,1)+1)
7786 dbo.GetStockPrice(9876,1,1)
7906 dbo.GetStockPrice(5565,1,1)
7911 dbo.GetStockPrice(7886,1,1)
7912 dbo.GetStockPrice(87,1,1)
8403 dbo.PriceValue(479,NULL,NULL)*dbo.GetStockPrice(6544,1,1)+dbo.FNAMargin(1,NULL,0)
Some explanation: The string is cut into parts, using your function's name as the splitter. Fortunately you tagged this with [sql-server-2012], so you can use LAG(). It tests whether the previous element is $$$FoundIt$$$. If so, the first closing bracket of the current part gets an additional ,1. The rest is re-concatenation.
Attention: If your call might include a computed value such as
dbo.GetStockPrice(1211,(1+2))
or
dbo.GetStockPrice(dbo.SomeOtherFunc(1),1)
...the first closing bracket is the wrong place to insert the ,1. But this would get really tricky... You'd have to run through it, char by char, and count the opening brackets to find the related closing one.
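A rough sketch of that character-by-character walk, for a single string held in a variable. It is not part of the original answer: the sample expression and the names @expr, @name, @search, @argStart, @pos and @depth are made up for illustration. It assumes that brackets never appear inside string literals in the stored expressions.

```sql
-- Hypothetical input that combines both problem cases named above.
DECLARE @expr VARCHAR(1000) = 'dbo.GetStockPrice(dbo.SomeOtherFunc(1),(1+2))+1';
DECLARE @name VARCHAR(100) = 'dbo.GetStockPrice(';
DECLARE @search INT = 1;      -- position from which to look for the next call
DECLARE @argStart INT, @pos INT, @depth INT;

WHILE 1 = 1
BEGIN
    SET @argStart = CHARINDEX(@name, @expr, @search);
    IF @argStart = 0 BREAK;                      -- no further call found
    SET @argStart = @argStart + LEN(@name);      -- first character after the opening bracket
    SET @pos = @argStart;
    SET @depth = 1;                              -- we are inside one open bracket

    -- Walk forward, counting brackets, until the one opened by the call is closed.
    WHILE @pos <= LEN(@expr) AND @depth > 0
    BEGIN
        SET @depth = @depth + CASE SUBSTRING(@expr, @pos, 1)
                                  WHEN '(' THEN 1
                                  WHEN ')' THEN -1
                                  ELSE 0
                              END;
        IF @depth > 0 SET @pos = @pos + 1;       -- stay on the matching bracket once found
    END;

    IF @depth > 0 BREAK;                         -- unbalanced brackets: leave the rest untouched
    SET @expr = STUFF(@expr, @pos, 0, ',1');     -- insert the new argument before the matching bracket
    SET @search = @argStart;                     -- continue inside the arguments to catch nested calls
END;

SELECT @expr AS Modified;
```

For the sample input this should yield dbo.GetStockPrice(dbo.SomeOtherFunc(1),(1+2),1)+1. To apply it to a whole table, the loop could be wrapped in a scalar function and called from an UPDATE, at the cost of row-by-row processing.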

OpenJson using a wildcard

I have a SQL query using OPENJSON to import JSON data into a table. My problem is that the data I need is nested. How can I use a wildcard in the JSON path to get what I need?
SELECT @Set =
BulkColumn FROM OPENROWSET
(BULK 'Sets.json', DATA_SOURCE = 'MyAzureJson', SINGLE_BLOB) JSON;
INSERT INTO [Sets]
SELECT [name]
FROM OPENJSON(@Set)
WITH(
[name] nvarchar(50) '$.*.name'
)
My JSON file is set up like this:
{
"testOne" : {
"name": "nameOne"
},
"testTwo" : {
"name": "nameTwo"
}
}
the error I'm getting with everything I try..
JSON path is not properly formatted. Unexpected character '*' is found at position 2.
I've tried . * [] and nothing works
As far as I know there is no support for wildcards in OPENJSON.
Instead you can do a workaround by ignoring the field name in your search. Use JSON_VALUE for this.
INSERT INTO [Sets]
SELECT
JSON_VALUE([value], '$.name')
FROM
OPENJSON(@Set)
Explanation: If you don't define the columns of OPENJSON inside a WITH clause and instead run a simple SELECT * FROM OPENJSON(@Set) query, you get a result with key, value and type columns (see example output below). Because key contains your problematic field name, you can ignore that part and just look into the value column of the data.
[key]    [value]                 [type]
-------  ----------------------  ------
testOne  { "name": "nameOne" }   5
testTwo  { "name": "nameTwo" }   5
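As a self-contained illustration of this workaround, the sketch below inlines the JSON in a variable instead of loading it through OPENROWSET, and also returns the key next to the extracted name. The column alias SetKey is an assumption for the example. OPENJSON and JSON_VALUE need SQL Server 2016 or later with database compatibility level 130 or higher.

```sql
DECLARE @Set NVARCHAR(MAX) = N'{
  "testOne": { "name": "nameOne" },
  "testTwo": { "name": "nameTwo" }
}';

-- Without a WITH clause, OPENJSON returns one row per top-level property
-- with the columns [key], [value] and [type].
SELECT [key] AS SetKey,
       JSON_VALUE([value], '$.name') AS [name]
FROM OPENJSON(@Set);
```

This should return one row per top-level property, whatever the property is called, which is the effect the wildcard was meant to achieve.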

OpenXML - Read last node in SQL

I have an XML like this
<Cat>
<Inner>
<PATCat>
<Pat>SUR</Pat>
<EfDa>20170411093000</EfDa>
</PATCat>
<PATCat>
<Pat>MH</Pat>
<EfDa>20170411094100</EfDa>
</PATCat>
<PATCat>
<Pat>NRO</Pat>
<EfDa>20170411095300</EfDa>
</PATCat>
<PATCat>
<Pat>DAY</Pat>
<EfDa>20170411110900</EfDa>
</PATCat>
</Inner>
</Cat>
and I am using this query to read the nodes Pat and EfDa
SELECT @PATCat_Pat = Pat,
@PATCat_EfDa = EfDa
FROM OPENXML(@idoc, '/Cat/Inner', 2)
WITH (
Pat VARCHAR(20) 'PATCat/Pat',
EfDa VARCHAR(20) 'PATCat/EfDa'
)
The result is @PATCat_Pat = SUR and @PATCat_EfDa = 20170411093000, whereas I want to read the last node, which is "DAY" and "20170411110900".
How can I achieve this? Any help would be appreciated.
Thanks
You should use value() and last() for xml type instead of OPENXML
DECLARE @xml XML = N'<Cat>
<Inner>
<PATCat>
<Pat>SUR</Pat>
<EfDa>20170411093000</EfDa>
</PATCat>
<PATCat>
<Pat>MH</Pat>
<EfDa>20170411094100</EfDa>
</PATCat>
<PATCat>
<Pat>NRO</Pat>
<EfDa>20170411095300</EfDa>
</PATCat>
<PATCat>
<Pat>DAY</Pat>
<EfDa>20170411110900</EfDa>
</PATCat>
</Inner>
</Cat>'
SELECT @xml.value('(Cat/Inner/PATCat[last()]/Pat)[1]', 'varchar(10)') AS PAT,
@xml.value('(Cat/Inner/PATCat[last()]/EfDa)[1]', 'varchar(30)') AS EfDa
Returns
PAT EfDa
DAY 20170411110900
last() can be used with OPENXML as well.
SELECT Pat,
EfDa
FROM OPENXML(@idoc, '/Cat/Inner/PATCat[last()]', 2)
WITH (
Pat VARCHAR(20) 'Pat',
EfDa VARCHAR(20) 'EfDa'
);
FROM OPENXML, together with the stored procedures that prepare and remove a document, is outdated and should not be used any more (rare exceptions exist). Rather use the appropriate methods the XML data type provides.
From a comment I gather that your function receives the handle, so you'll have to stick with this...
In your question you write that you want to read the last node, which is "DAY" and "20170411110900".
What is your criterion? The last one, or the one with <Pat>="DAY", or (if there might be more of the same) the last of all <PATCat> elements that have <Pat>="DAY"? Are the elements always in the same order? Is the last <PATCat> always the one with <Pat>="DAY"?
You have got the solution with last() already. It will find the last <PATCat> no matter what's inside:
'/Cat/Inner/PATCat[last()]'
Looking for the one with "DAY" would be this (element names are case-sensitive, so it is Pat, not PAT):
'/Cat/Inner/PATCat[(Pat/text())[1]="DAY"][1]'
If there might be more with "DAY" you could replace the last [1] with [last()]
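To compare the three criteria side by side, here is a small sketch that uses the XML data type methods instead of OPENXML. The sample document is altered for illustration (the second entry is changed to "DAY" so that two candidates exist), and the column aliases are made up.

```sql
-- Assumption: modified sample data with two <PATCat> elements carrying "DAY".
DECLARE @xml XML = N'<Cat><Inner>
  <PATCat><Pat>SUR</Pat><EfDa>20170411093000</EfDa></PATCat>
  <PATCat><Pat>DAY</Pat><EfDa>20170411094100</EfDa></PATCat>
  <PATCat><Pat>NRO</Pat><EfDa>20170411095300</EfDa></PATCat>
  <PATCat><Pat>DAY</Pat><EfDa>20170411110900</EfDa></PATCat>
</Inner></Cat>';

SELECT
    -- last <PATCat>, whatever it contains
    @xml.value('(/Cat/Inner/PATCat[last()]/EfDa)[1]', 'varchar(20)') AS LastOfAll,
    -- first <PATCat> whose <Pat> is "DAY"
    @xml.value('(/Cat/Inner/PATCat[(Pat/text())[1]="DAY"][1]/EfDa)[1]', 'varchar(20)') AS FirstDay,
    -- last <PATCat> whose <Pat> is "DAY"
    @xml.value('(/Cat/Inner/PATCat[(Pat/text())[1]="DAY"][last()]/EfDa)[1]', 'varchar(20)') AS LastDay;
```

With this data, LastOfAll and LastDay should both be 20170411110900, while FirstDay should be 20170411094100. The same path expressions can be passed as the row pattern to OPENXML when only the document handle is available.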

Get multiple xml nodes (delimited)

I have a table with a xml that is formatted something like this (simplified for readability)
<parentItem xmlns:i="http://tempuri.org/1" xmlns="http://tempuri.org/2">
<ItemA></ItemA>
<ItemB></ItemB>
<ItemC xmlns:d2p1="http://tempuri.org/3">
<d2p1:string>value1</d2p1:string>
<d2p1:string>value2</d2p1:string>
<d2p1:string>value3</d2p1:string>
<!-- .... (0 to many strings here) -->
</ItemC>
</parentItem>
The only things I care about are the values in parentItem > ItemC > string
I would like to get those values delimited by something, such as a comma
Desired Result: "value1,value2,value3"
currently I can get one value by doing this:
SELECT CAST([QueryXml] as xml).value('(/*:parentItem/*:ItemC/node())[1]','nvarchar(max)')
FROM [opendb].[dbo].[MyTable]
Result: "value1"
I can also get all the values like this:
SELECT CAST([QueryXml] as xml).value('(/*:ConflictsSearchTermQuery/*:TermItems)[1]','nvarchar(max)')
FROM [opendb].[dbo].[ConflictsSearchTerms]
Result: "value1value2value3"
but I'm looking to get a delimited set of values
Desired Result: "value1,value2,value3"
To get multiple values out of XML you need to use the nodes() method of the XML data type.
However, since this method does not return a single, scalar value (but a rowset), you need to call it through CROSS APPLY.
WITH MyTable AS (
SELECT 1 AS ID, CAST('<parentItem xmlns:i="http://tempuri.org/1" xmlns="http://tempuri.org/2">
<ItemA></ItemA>
<ItemB></ItemB>
<ItemC xmlns:d2p1="http://tempuri.org/3">
<d2p1:string>value1</d2p1:string>
<d2p1:string>value2</d2p1:string>
<d2p1:string>value3</d2p1:string>
<!-- .... (0 to many strings here) -->
</ItemC>
</parentItem>' AS XML) AS QueryXml
)
SELECT
t.ID,
x.node.value('.', 'varchar(100)') AS nodeValue
FROM
MyTable t
CROSS APPLY QueryXml.nodes('
declare namespace i="http://tempuri.org/1";
declare namespace def="http://tempuri.org/2";
declare namespace d2p1="http://tempuri.org/3";
/def:parentItem/def:ItemC/d2p1:string'
) x(node)
gives you
ID nodeValue
----------- ------------------
1 value1
1 value2
1 value3
After that, if you really must, standard techniques for concatenating values in SQL Server apply.
Note that I have properly declared the namespaces in the XQuery instead of using *. Namespaces are important, don't ignore them.
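One of those standard techniques, sketched here under a few assumptions: the table variable @MyTable stands in for the real table, the alias DelimitedValues is made up, and the order of the values relies on nodes() returning them in document order, which is the usual behaviour but is not enforced by an ORDER BY. The pattern is FOR XML PATH('') for the concatenation plus STUFF to remove the leading comma; it also works on versions that predate STRING_AGG (SQL Server 2017).

```sql
DECLARE @MyTable TABLE (ID INT, QueryXml XML);
INSERT INTO @MyTable VALUES
(1, N'<parentItem xmlns:i="http://tempuri.org/1" xmlns="http://tempuri.org/2">
  <ItemA></ItemA>
  <ItemB></ItemB>
  <ItemC xmlns:d2p1="http://tempuri.org/3">
    <d2p1:string>value1</d2p1:string>
    <d2p1:string>value2</d2p1:string>
    <d2p1:string>value3</d2p1:string>
  </ItemC>
</parentItem>');

SELECT t.ID,
       STUFF((
           -- one ",value" fragment per <d2p1:string> node of the current row
           SELECT ',' + x.node.value('.', 'nvarchar(100)')
           FROM t.QueryXml.nodes('
               declare namespace def="http://tempuri.org/2";
               declare namespace d2p1="http://tempuri.org/3";
               /def:parentItem/def:ItemC/d2p1:string') AS x(node)
           FOR XML PATH(''), TYPE
       ).value('.', 'nvarchar(max)'), 1, 1, '') AS DelimitedValues   -- STUFF drops the leading comma
FROM @MyTable AS t;
```

For the sample row this should give value1,value2,value3. A row with no string elements yields NULL rather than an empty string.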

Using xml with SQL to evaluate a dynamic variable

This is a unique problem, I think. My goal is to input a variable and get a row from my column. Let me explain a little with the code I'm using.
SELECT
pref.query('Database/text()') as PersonSkills,
pref.query('FillQuery/text()') as PersonSkills,
pref.query('TabText/text()') as PersonSkills,
pref.query('TooltipText/text()') as PersonSkills
FROM table CROSS APPLY
Tag.nodes('/Root/Configuration/TaskSelectorControl/QueueSelector') AS People(pref)
This works fine. However, what I need to do is pass in the last part, the queue selector, as a variable.
DECLARE @Xml XML
DECLARE @AttributeName VARCHAR(MAX) = 'QueueSelector'
SELECT
pref.query('Database/text()') as PersonSkills,
pref.query('FillQuery/text()') as PersonSkills,
pref.query('TabText/text()') as PersonSkills,
pref.query('TooltipText/text()') as PersonSkills
FROM table CROSS APPLY
Tag.nodes('/Root/Configuration/TaskSelectorControl[@Name=sql:variable("@AttributeName")]
') AS People(pref)
This doesn't work, any ideas why?
Well, I kind of lied: the second query runs, but it returns an empty result set.
/Root/Configuration/TaskSelectorControl/QueueSelector
is not equivalent to:
/Root/Configuration/TaskSelectorControl[@Name='QueueSelector']
The above XPath selects <TaskSelectorControl Name="QueueSelector">, not <QueueSelector> children of <TaskSelectorControl>.
You could either do this in XPath:
/Root/Configuration/TaskSelectorControl/*[local-name(.)=sql:variable("@AttributeName")]
Or it might be simpler to concatenate the path before evaluating it. Since nodes() only accepts a string literal as its XQuery argument, the concatenated path would have to go into a dynamic SQL statement:
'/Root/Configuration/TaskSelectorControl/' + @AttributeName
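To make the local-name() variant concrete, here is a small self-contained sketch. The sample XML, the second element OtherSelector, the values Db1/Db2 and the column aliases are invented for illustration; only the path and the variable name come from the question.

```sql
-- Assumption: a made-up document with two differently named children.
DECLARE @Xml XML = N'<Root><Configuration><TaskSelectorControl>
  <QueueSelector><Database>Db1</Database><TabText>Queue</TabText></QueueSelector>
  <OtherSelector><Database>Db2</Database><TabText>Other</TabText></OtherSelector>
</TaskSelectorControl></Configuration></Root>';
DECLARE @AttributeName VARCHAR(100) = 'QueueSelector';

-- The wildcard step matches every child element; the predicate keeps
-- only those whose element name equals the T-SQL variable.
SELECT pref.value('(Database/text())[1]', 'nvarchar(100)') AS DatabaseName,
       pref.value('(TabText/text())[1]', 'nvarchar(100)') AS TabText
FROM @Xml.nodes('/Root/Configuration/TaskSelectorControl/*[local-name(.)=sql:variable("@AttributeName")]') AS People(pref);
```

With @AttributeName set to QueueSelector this should return the single row Db1 / Queue. Because the element name is passed as data rather than spliced into the path, no dynamic SQL is needed.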