OPENXML - Read last node in SQL

I have XML like this:
<Cat>
<Inner>
<PATCat>
<Pat>SUR</Pat>
<EfDa>20170411093000</EfDa>
</PATCat>
<PATCat>
<Pat>MH</Pat>
<EfDa>20170411094100</EfDa>
</PATCat>
<PATCat>
<Pat>NRO</Pat>
<EfDa>20170411095300</EfDa>
</PATCat>
<PATCat>
<Pat>DAY</Pat>
<EfDa>20170411110900</EfDa>
</PATCat>
</Inner>
</Cat>
and I am using this query to read the nodes Pat and EfDa:
SELECT @PATCat_Pat = Pat,
@PATCat_EfDa = EfDa
FROM OPENXML(@idoc, '/Cat/Inner', 2)
WITH (
Pat VARCHAR(20) 'PATCat/Pat',
EfDa VARCHAR(20) 'PATCat/EfDa'
)
The result is @PATCat_Pat = SUR and @PATCat_EfDa = 20170411093000, whereas I want to read the last node, which is "DAY" and "20170411110900".
How can I achieve this? Any help would be appreciated.
Thanks

You should use value() and last() on the xml type instead of OPENXML:
DECLARE @xml XML = N'<Cat>
<Inner>
<PATCat>
<Pat>SUR</Pat>
<EfDa>20170411093000</EfDa>
</PATCat>
<PATCat>
<Pat>MH</Pat>
<EfDa>20170411094100</EfDa>
</PATCat>
<PATCat>
<Pat>NRO</Pat>
<EfDa>20170411095300</EfDa>
</PATCat>
<PATCat>
<Pat>DAY</Pat>
<EfDa>20170411110900</EfDa>
</PATCat>
</Inner>
</Cat>'
SELECT @xml.value('(Cat/Inner/PATCat[last()]/Pat)[1]', 'varchar(10)') AS PAT,
@xml.value('(Cat/Inner/PATCat[last()]/EfDa)[1]', 'varchar(30)') AS EfDa
Returns:
PAT EfDa
DAY 20170411110900

last() can be used with OPENXML as well.
SELECT Pat,
EfDa
FROM OPENXML(@idoc, '/Cat/Inner/PATCat[last()]', 2)
WITH (
Pat VARCHAR(20) 'Pat',
EfDa VARCHAR(20) 'EfDa'
);

FROM OPENXML, together with the stored procedures that prepare and remove a document (sp_xml_preparedocument and sp_xml_removedocument), is outdated and should not be used any more (rare exceptions exist). Use the methods the XML data type provides instead.
From a comment I gather that your function receives the handle, so you'll have to stick with OPENXML here.
In your question you write that you want to read the last node, which is "DAY" and "20170411110900".
What is your criterion? The last <PATCat>, the one with <Pat>="DAY", or (if there might be several of those) the last <PATCat> that has <Pat>="DAY"? Are the elements always in the same order? Is the last <PATCat> always the one with <Pat>="DAY"?
You have got the solution with last() already. It finds the last <PATCat> no matter what's inside:
'/Cat/Inner/PATCat[last()]'
Looking for the first one with "DAY" would be this (element names are case-sensitive, so it has to be Pat, not PAT):
'/Cat/Inner/PATCat[(Pat/text())[1]="DAY"][1]'
If there might be more than one with "DAY", you can replace the final [1] with [last()].
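To compare the three criteria side by side, here is a minimal sketch using the xml type rather than a document handle. The sample data is made up for illustration (it deliberately contains "DAY" twice), and the column aliases are hypothetical. The same path expressions should carry over to the OPENXML row pattern, but that is an assumption I have not verified.

```sql
-- Made-up sample: "DAY" appears in the first and the last <PATCat>
DECLARE @xml XML = N'<Cat><Inner>
  <PATCat><Pat>DAY</Pat><EfDa>20170411093000</EfDa></PATCat>
  <PATCat><Pat>MH</Pat><EfDa>20170411094100</EfDa></PATCat>
  <PATCat><Pat>DAY</Pat><EfDa>20170411110900</EfDa></PATCat>
</Inner></Cat>';

SELECT
    -- last <PATCat>, whatever it contains
    @xml.value('(/Cat/Inner/PATCat[last()]/EfDa/text())[1]', 'varchar(20)') AS LastOfAll,
    -- first <PATCat> whose <Pat> is "DAY"
    @xml.value('(/Cat/Inner/PATCat[(Pat/text())[1]="DAY"][1]/EfDa/text())[1]', 'varchar(20)') AS FirstDay,
    -- last <PATCat> whose <Pat> is "DAY"
    @xml.value('(/Cat/Inner/PATCat[(Pat/text())[1]="DAY"][last()]/EfDa/text())[1]', 'varchar(20)') AS LastDay;
```

The second predicate is applied to the already filtered set, which is why [1] and [last()] pick the first and last "DAY" entries rather than the first and last <PATCat> overall.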

Related

Extracting specific value from large string SQL

I've used a combination of CHARINDEX and SUBSTRING but can't get it working.
I get passed a variable in SQL that contains a lot of text but has an email in it. I need to extract the email value.
I have to use SQL 2008.
I'm trying to extract the value between "EmailAddress":" and ",
An example string is here:
{ "Type":test,
"Admin":test,
"User":{
"UserID":"16959191",
"FirstName":"Test",
"Surname":"Testa",
"EmailAddress":"Test.Test@test.com",
"Address":"Test"
}
}
Assuming you can't upgrade to 2016 or higher, you can use a combination of SUBSTRING and CHARINDEX.
I've used a common table expression to make it less cumbersome, but you don't have to.
DECLARE @json varchar(4000) = '{ "Type":test,
"Admin":test,
"User":{
"UserID":"16959191",
"FirstName":"Test",
"Surname":"Testa",
"EmailAddress":"Test.Test@test.com",
"Address":"Test"
}
}';
WITH CTE AS
(
SELECT @json AS Json,
CHARINDEX('"EmailAddress":', @json) + LEN('"EmailAddress":') AS StartIndex
)
SELECT SUBSTRING(Json, StartIndex, CHARINDEX(',', Json, StartIndex) - StartIndex)
FROM CTE
Result: "Test.Test@test.com"
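Note that the result above still carries the surrounding double quotes. If they are unwanted, one option is to include the opening quote in the search prefix and stop at the next quote instead of the comma. The following is only a sketch under assumptions: the variable @prefix is a name I made up, the value is assumed not to contain escaped quotes, and a closing quote is assumed to exist.

```sql
DECLARE @json VARCHAR(4000) =
    '{ "User":{ "EmailAddress":"Test.Test@test.com", "Address":"Test" } }';
-- Hypothetical helper variable: the key plus the opening quote of its value
DECLARE @prefix VARCHAR(50) = '"EmailAddress":"';

WITH CTE AS
(
    SELECT @json AS Json,
           CHARINDEX(@prefix, @json) AS PrefixPos   -- 0 if the key is missing
)
SELECT CASE
           WHEN PrefixPos = 0 THEN NULL              -- key not found
           ELSE SUBSTRING(Json,
                          PrefixPos + LEN(@prefix),
                          -- length = position of the closing quote minus start of the value
                          CHARINDEX('"', Json, PrefixPos + LEN(@prefix))
                              - (PrefixPos + LEN(@prefix)))
       END AS Email
FROM CTE;
```

The CASE guards against a missing key, where CHARINDEX returns 0 and the computed start position would otherwise point at unrelated text.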
The first hint: move to v2016 if possible and use the native JSON support. v2008 is absolutely outdated...
The second hint: any string manipulation (and all my approaches below need some too) will suffer from forbidden characters, unexpected blanks or any other surprise you might find in your data.
Try it like this:
First I create a mockup scenario to simulate your issue:
DECLARE @tbl TABLE(ID INT IDENTITY,YourJson NVARCHAR(MAX));
INSERT INTO @tbl VALUES
(N'{ "Type":"test1",
"Admin":"test1",
"User":{
"UserID":"16959191",
"FirstName":"Test1",
"Surname":"Test1a",
"EmailAddress":"Test1.Test1@test.com",
"Address":"Test1"
}
}')
,(N'{ "Type":"test2",
"Admin":"test2",
"User":{
"UserID":"16959191",
"FirstName":"Test2",
"Surname":"Test2a",
"EmailAddress":"Test2.Test2@test.com",
"Address":"Test2"
}
}');
--Starting with v2016 there is JSON support
SELECT JSON_VALUE(t.YourJson, '$.User.EmailAddress')
FROM @tbl t;
--String methods
--use CHARINDEX and SUBSTRING
--the first border includes the opening quote so that it is not part of the result
DECLARE @FirstBorder NVARCHAR(MAX)='"EmailAddress":"';
DECLARE @SecondBorder NVARCHAR(MAX)='",';
SELECT t.*
,A.Pos1
,B.Pos2
,SUBSTRING(t.YourJson,A.Pos1,B.Pos2 - A.Pos1) AS ExtractedEMail
FROM @tbl t
OUTER APPLY(SELECT CHARINDEX(@FirstBorder,t.YourJson)+LEN(@FirstBorder)) A(Pos1)
OUTER APPLY(SELECT CHARINDEX(@SecondBorder,t.YourJson,A.Pos1)) B(Pos2);
--use an XML trick
SELECT CAST('<x>' + REPLACE(REPLACE((SELECT t.YourJson AS [*] FOR XML PATH('')),'"EmailAddress":','<mailAddress value='),',',' />') + '</x>' AS XML)
.value('(/x/mailAddress/@value)[1]','nvarchar(max)')
FROM @tbl t
Some explanations:
JSON support parses the value directly from a JSON path.
For CHARINDEX and SUBSTRING I use APPLY. The advantage is that you can use the computed positions like variables, with no need to repeat the CHARINDEX calls over and over.
The XML approach transforms your JSON into rather strange and ugly XML. The only meaningful element is <mailAddress> with an attribute named value. We can use the native XML method .value() to retrieve the value you are asking for.
The intermediate XML looks like this:
<x>{ "Type":"test1" />
"Admin":"test1" />
"User":{
"UserID":"16959191" />
"FirstName":"Test1" />
"Surname":"Test1a" />
<mailAddress value="Test1.Test1@test.com" />
"Address":"Test1"
}
}</x>

Searching through XML in T-SQL with conditions

I am trying to get the correct info from an XML data type into regular scalar variables based on conditions, however I am having trouble getting the correct info back.
Here is the XML I am searching through:
<Loop2420>
<NM1>
<F98_1>PW</F98_1>
<F1065>2</F1065>
</NM1>
<N3>
<F166>81715 DOCTOR CARRE</F166>
</N3>
<N4>
<F19>INDIO</F19>
<F156>CA</F156>
<F116>92201</F116>
</N4>
</Loop2420>
<Loop2420>
<NM1>
<F98_1>45</F98_1>
<F1065>2</F1065>
</NM1>
<N3>
<F166>51250 MECCA AVE</F166>
</N3>
<N4>
<F19>COACHELLA</F19>
<F156>CA</F156>
<F116>92236</F116>
</N4>
</Loop2420>
Basically I need to get the number from <F116>, but only if <F98_1> is equal to 'PW'.
I have tried:
declare @zip varchar(30)
select @zip = T.value('(F116)[1]','varchar(30)')
from @TransactionXML.nodes('/Loop2420/N4') Trans(T)
where T.value('(/Loop2420/NM1/F98_1)[1]','varchar(30)') = 'PW'
But that sometimes returns the value from <F116> even if <F98_1> is equal to '45'.
Any suggestions? Thanks.
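Put the test in the XQuery itself and anchor it to the node you're checking. The leading / in your WHERE clause makes the path absolute, so it always tests the first <Loop2420> in the document rather than the one the current row belongs to:
SELECT @zip = T.value('(N4/F116)[1]', 'varchar(30)')
FROM @TransactionXML.nodes('/Loop2420') Trans(T)
WHERE T.exist('NM1/F98_1[text()="PW"]') = 1
If PW is not a static value, use the sql:variable() or sql:column() function to incorporate it in the query.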
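As an illustration of the sql:variable() route, here is a minimal sketch. The variable name @code and the trimmed-down sample XML are my own assumptions, not part of the question; the filter is moved into the nodes() path so that only matching <Loop2420> elements produce a row.

```sql
-- Trimmed-down sample with the same element names as in the question
DECLARE @TransactionXML XML = N'
<Loop2420><NM1><F98_1>PW</F98_1></NM1><N4><F116>92201</F116></N4></Loop2420>
<Loop2420><NM1><F98_1>45</F98_1></NM1><N4><F116>92236</F116></N4></Loop2420>';

DECLARE @code VARCHAR(30) = 'PW';  -- hypothetical variable holding the qualifier
DECLARE @zip  VARCHAR(30);

-- Only <Loop2420> elements whose NM1/F98_1 equals @code are shredded into rows
SELECT @zip = T.value('(N4/F116/text())[1]', 'varchar(30)')
FROM @TransactionXML.nodes('/Loop2420[NM1/F98_1/text() = sql:variable("@code")]') AS Trans(T);

SELECT @zip AS Zip;
```

With this sample only the first <Loop2420> satisfies the predicate, so the assignment sees a single row. If several elements can match, the variable assignment keeps just one of them, and a plain SELECT without the variable is the safer choice.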

Oracle 10.2.0.4.0 query on partial xpath

I need to change the below query to be able to query any kind of tender item.
/Basket/CardTenderItem/Description
/Basket/CashTenderItem/Description
So
/Basket/WildcardTenderItem/Description
I have looked at various examples but cannot get them to bring back any results when running (I'll happily admit to user error if I can get it working!)
SELECT
RETURN_ID
,SALE_ID
,extractValue(xmltype(RETURNxml),'/Basket/CashTenderItem/NetValue')
,extractValue(xmltype(RETURNxml),'/Basket/CashTenderItem/Description')
FROM SPR361
WHERE return_id = '9999.0303|20170327224954|2063'
If you only want to match names that end with TenderItem, with nothing after it, you could be specific with substring checks:
SELECT
RETURN_ID
,SALE_ID
,extractValue(xmltype(RETURNxml),
'/Basket/*[substring(name(), string-length(name()) - 9) = "TenderItem"]/NetValue')
,extractValue(xmltype(RETURNxml),
'/Basket/*[substring(name(), string-length(name()) - 9) = "TenderItem"]/Description')
FROM SPR361
WHERE return_id = '9999.0303|20170327224954|2063'
If you never have any nodes with anything after that fixed string, then @Shnugo's contains approach is easier, and in Oracle would be very similar:
...
,extractValue(xmltype(RETURNxml),
'/Basket/*[contains(name(), "TenderItem")]/NetValue')
,extractValue(xmltype(RETURNxml),
'/Basket/*[contains(name(), "TenderItem")]/Description')
I'm not sure there's any real difference between name() and local-name() here.
If a basket can have multiple child nodes (card and cash, or more than one of each) you could also switch to XMLTable syntax:
SELECT
s.RETURN_ID
,s.SALE_ID
,x.netvalue
,x.description
FROM SPR361 s
CROSS JOIN XMLTable(
'/Basket/*[contains(name(), "TenderItem")]'
PASSING XMLType(s.RETURNxml)
COLUMNS netvalue NUMBER PATH './NetValue'
, description VARCHAR(80) PATH './Description'
) x
WHERE s.return_id = '9999.0303|20170327224954|2063'
And it's maybe overkill here, but for more complicated tests you can use other XQuery syntax, like a FLWOR expression:
CROSS JOIN XMLTable(
'for $i in /Basket/*
where contains($i/name(), "TenderItem") return $i'
PASSING XMLType(s.RETURNxml)
...
This is SQL Server syntax and I cannot test whether it works in Oracle too, but I think it will. You can use the XQuery function contains():
DECLARE @xml XML=
N'<root>
<abcTenderItem>test1</abcTenderItem>
<SomeOther>should not show up</SomeOther>
<xyzTenderItem>test2</xyzTenderItem>
</root>';
SELECT @xml.query(N'/root/*[contains(local-name(),"TenderItem")]')
Only the elements with "TenderItem" in their names show up:
<abcTenderItem>test1</abcTenderItem>
<xyzTenderItem>test2</xyzTenderItem>

Using xml with SQL to evaluate a dynamic variable

This is a unique problem, I think. My goal is to pass in a variable and get the matching rows back from my column. Let me explain with the code I'm using.
SELECT
pref.query('Database/text()') as PersonSkills,
pref.query('FillQuery/text()') as PersonSkills,
pref.query('TabText/text()') as PersonSkills,
pref.query('TooltipText/text()') as PersonSkills
FROM table CROSS APPLY
Tag.nodes('/Root/Configuration/TaskSelectorControl/QueueSelector') AS People(pref)
This works fine. However, what I need to do is pass in the last part, the QueueSelector, as a variable.
DECLARE @Xml XML
DECLARE @AttributeName VARCHAR(MAX) = 'QueueSelector'
SELECT
pref.query('Database/text()') as PersonSkills,
pref.query('FillQuery/text()') as PersonSkills,
pref.query('TabText/text()') as PersonSkills,
pref.query('TooltipText/text()') as PersonSkills
FROM table CROSS APPLY
Tag.nodes('/Root/Configuration/TaskSelectorControl[@Name=sql:variable("@AttributeName")]') AS People(pref)
This doesn't work, any ideas why?
Well, I kind of lied: it runs, but it returns an empty result set.
/Root/Configuration/TaskSelectorControl/QueueSelector
is not equivalent to:
/Root/Configuration/TaskSelectorControl[@Name='QueueSelector']
The second XPath selects <TaskSelectorControl Name="QueueSelector">, not the <QueueSelector> children of <TaskSelectorControl>.
You could either do this in XPath:
/Root/Configuration/TaskSelectorControl/*[local-name(.)=sql:variable("@AttributeName")]
Or it might be simpler to concatenate the path before evaluating it. Note that the xml data type methods only accept a string literal as their XQuery argument, so this means building the whole statement as dynamic SQL:
'/Root/Configuration/TaskSelectorControl/' + @AttributeName
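A minimal sketch of the local-name() variant, assuming a made-up document shape (the element contents and the second selector <OtherSelector> are invented for illustration) and a variable instead of the table column from the question:

```sql
-- Invented sample document; only the path down to TaskSelectorControl comes from the question
DECLARE @Xml XML = N'<Root><Configuration><TaskSelectorControl>
  <QueueSelector><Database>Db1</Database><TabText>Queues</TabText></QueueSelector>
  <OtherSelector><Database>Db2</Database><TabText>Other</TabText></OtherSelector>
</TaskSelectorControl></Configuration></Root>';

DECLARE @AttributeName VARCHAR(100) = 'QueueSelector';

-- * matches every child element; the predicate keeps the one whose name equals the variable
SELECT pref.value('(Database/text())[1]', 'varchar(100)') AS DatabaseName,
       pref.value('(TabText/text())[1]',  'varchar(100)') AS TabText
FROM @Xml.nodes('/Root/Configuration/TaskSelectorControl/*[local-name(.) = sql:variable("@AttributeName")]') AS People(pref);
```

Compared with dynamic SQL this keeps the statement static, so the element name is passed as data and never becomes part of the query text.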

Better way in TSQL to search xml for a node that doesn't exist

We have a source XML file that has address nodes, and each address is supposed to have a zip_code node beneath it in order to validate. We received a file that failed the schema validation because at least one node was missing its zip_code (there were several thousand addresses in the file).
We need to find the elements that do not have a zip code, so we can repair the file and send an audit report to the source.
--declare @x xml = bulkcolumn from openrowset(bulk 'x:\file.xml',single_blob) as s
declare @x xml = N'<addresses>
<address><external_address_id>1</external_address_id><zip_code>53207</zip_code></address>
<address><external_address_id>2</external_address_id></address>
</addresses>'
declare @t xml = (
select @x.query('for $a in .//address
return
if ($a/zip_code)
then <external_address_id />
else $a/external_address_id')
)
select x.AddressID.value('.', 'int') AddressID
from @t.nodes('./external_address_id') x(AddressID)
where x.AddressID.value('.', 'int') > 0
GO
Really, it's the where clause that bugs me. I feel like I'm depending on a cast for a null value to 0, and it works, but I'm not really sure that it should. I tried a few variations with the .exist function, but I couldn't get the correct result.
If you just want to ensure that you are selecting address elements that have a zip_code element, then adjust your XPath to include that criterion in a predicate filter:
/addresses/address[zip_code]
If you also want to ensure that the zip_code element has a value, use a predicate filter on zip_code to select those that have text() nodes:
/addresses/address[zip_code[text()]]
EDIT:
"Actually, I'm looking for the opposite. I need to identify the nodes that don't have a zip, so we can manually correct the source data."
So, if you want to identify all of the address elements that do not have a zip_code, you can specify it in the XPath like this:
/addresses/address[not(zip_code)]
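Plugged into nodes(), that predicate might look like the following sketch. The third address with an empty <zip_code /> is something I added to show the difference between "element missing" and "element present but empty"; the aliases t and a are arbitrary.

```sql
DECLARE @x XML = N'<addresses>
  <address><external_address_id>1</external_address_id><zip_code>53207</zip_code></address>
  <address><external_address_id>2</external_address_id></address>
  <address><external_address_id>3</external_address_id><zip_code /></address>
</addresses>';

-- Addresses with no <zip_code> element at all
SELECT t.a.value('(external_address_id/text())[1]', 'int') AS AddressID
FROM @x.nodes('/addresses/address[not(zip_code)]') AS t(a);

-- Addresses whose <zip_code> is missing or has no text content
SELECT t.a.value('(external_address_id/text())[1]', 'int') AS AddressID
FROM @x.nodes('/addresses/address[not(zip_code/text())]') AS t(a);
```

Because the filtering happens in the path expression, no intermediate XML and no cast-based WHERE clause is needed.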
If you just want to locate those nodes that are missing their <zip_code> element, you could use something like this:
SELECT
ADRS.ADR.value('(external_address_id)[1]', 'int') as 'ExtAdrID'
FROM
@x.nodes('/addresses/address') as ADRS(ADR)
WHERE
ADRS.ADR.exist('zip_code') = 0
It uses the built-in .exist() method of the xml data type to check whether a subnode exists inside an XML node.