How to get element values from an XML column?

I have the XML below in a column. I need to get to \Report\Criterias\Criteria (where name="Advertisers")\Elements\Element(where name="ListViewAvailable"). From here I need to list all the numbers that are in the Value element.
So far I got:
SELECT xmlColumn.query('/Report/Criterias/Criteria/Elements/Element')
from tbl
but I have no idea how to filter.
<Report>
<Criterias>
<Criteria name="Date Range">
...
</Criteria>
<Criteria name="Advertisers">
<Elements>
<Element name="CheckBoxOne">
<Value>0</Value>
</Element>
<Element name="ListViewAvailable">
<Value>314</Value>
<Value>57</Value>
<Value>18886</Value>
<Value>7437</Value>
</Element>
</Elements>
</Criteria>
<Criteria name="Revenue Types">
...
</Criteria>
</Criterias>
</Report>

You can filter with a predicate ([]) and combine it with CROSS APPLY and nodes() to shred the XML at the Value element level:
SELECT C.value('.', 'int') AS Value
FROM tbl t
CROSS APPLY t.xmlColumn.nodes('
/Report/Criterias/Criteria[@name="Advertisers"]
/Elements/Element[@name="ListViewAvailable"]
/Value
') T(C)
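To try the predicate filter without the table, here is a minimal sketch that runs the same path against an XML variable. The variable name @doc and the trimmed sample document are assumptions made for illustration; the path and the two attribute predicates are the ones from the answer.

```sql
-- Hypothetical stand-alone test: @doc stands in for tbl.xmlColumn.
DECLARE @doc xml = N'
<Report>
  <Criterias>
    <Criteria name="Advertisers">
      <Elements>
        <Element name="CheckBoxOne"><Value>0</Value></Element>
        <Element name="ListViewAvailable">
          <Value>314</Value>
          <Value>57</Value>
        </Element>
      </Elements>
    </Criteria>
  </Criterias>
</Report>';

-- nodes() returns one row per matching Value element;
-- value('.', 'int') converts each element's text to an int.
SELECT V.C.value('.', 'int') AS Value
FROM @doc.nodes('/Report/Criterias/Criteria[@name="Advertisers"]
                 /Elements/Element[@name="ListViewAvailable"]
                 /Value') AS V(C);
-- Returns 314 and 57; the CheckBoxOne value 0 is filtered out by the predicate.
```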


How to put an attribute on the root element, and only the root element, in FOR XML PATH?

I'm generating XML from a SQL Server table.
This is my code:
;WITH XMLNAMESPACES
(
'http://www.w3.org/2001/XMLSchema-instance' AS xsi
--,DEFAULT 'http://www.w3.org/2001/XMLSchema-instance' -- xmlns
)
SELECT
'T_Contracts' AS "@tableName",
(SELECT * FROM T_Contracts
FOR XML PATH('row'), TYPE, ELEMENTS xsinil)
FOR XML PATH('table'), TYPE, ELEMENTS xsinil
I want the result to look like this (note: attribute tableName on the root element):
<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" tableName="T_Contracts">
<row>
<VTR_UID>779FE899-4E81-4D8C-BF9B-3F17BC1DF146</VTR_UID>
<VTR_MDT_ID>0</VTR_MDT_ID>
<VTR_VTP_UID xsi:nil="true" />
<VTR_Nr>0050/132251</VTR_Nr>
</row>
</table>
But it duplicates the XSI namespace on the row element...
<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" tableName="T_Contracts">
<row xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<VTR_UID>779FE899-4E81-4D8C-BF9B-3F17BC1DF146</VTR_UID>
<VTR_MDT_ID>0</VTR_MDT_ID>
<VTR_VTP_UID xsi:nil="true" />
<VTR_Nr>0050/132251</VTR_Nr>
</row>
</table>
What's the correct way to add an attribute to the root element, and only the root element ?
Note
NULL-values must be returned as <columnName xsi:nil="true" /> and not be omitted.
(And no xml.modify after the select)
Please note that this is NOT a duplicate of an existing question.
This annoying behaviour of repeated namespaces with sub-queries was a reported issue on MS Connect for more than 10 years, with thousands of votes. That platform was shut down and the issue went with it, so there is little prospect that MS will ever fix this.
Just to be fair: it is not wrong to repeat the namespace declaration. It just bloats the string-based output...
Even stranger is the unsupported attribute on a root-level node...
Well, if you need a headache, you might look into FOR XML EXPLICIT :-)
The accepted answer by Marc Guillot will not produce xsi:nil="true" attributes as you seem to need them. It will just wrap your result with the appropriate root node.
Finally: I first thought this could not be solved with XML methods alone, so you can try the string-based approach below (update: I found an XML-only way, see further down).
DECLARE @tbl TABLE(ID INT,SomeValue INT);
INSERT INTO @tbl VALUES(1,1),(2,NULL);
SELECT CAST(REPLACE(CAST(
(
SELECT *
FROM @tbl
FOR XML PATH('row'),ROOT('table'),TYPE, ELEMENTS XSINIL
) AS nvarchar(MAX)),'<table ','<table tableName="T_Contracts" ') AS XML);
The result
<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" tableName="T_Contracts">
<row>
<ID>1</ID>
<SomeValue>1</SomeValue>
</row>
<row>
<ID>2</ID>
<SomeValue xsi:nil="true" />
</row>
</table>
The idea in short:
We create the XML without a sub-query, cast it to a string, and add the attribute with a string function before casting the result back to XML.
As the position of an attribute is not important, we can add it anywhere in the root's start tag.
Alternatively, you might search for the first closing > and use STUFF() there...
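The STUFF() variant mentioned above could look roughly like this. It is a sketch under the assumption that the first > in the serialized string closes the root start tag, which holds for the output shown here; the variable name @s is made up for the example.

```sql
DECLARE @tbl TABLE (ID INT, SomeValue INT);
INSERT INTO @tbl VALUES (1, 1), (2, NULL);

-- Serialize the XML once; @s is a hypothetical helper variable.
DECLARE @s nvarchar(max);
SET @s = CAST((
    SELECT *
    FROM @tbl
    FOR XML PATH('row'), ROOT('table'), TYPE, ELEMENTS XSINIL
) AS nvarchar(max));

-- STUFF with a length of 0 deletes nothing and inserts the new text
-- immediately before the first '>', i.e. at the end of the root start tag.
SELECT CAST(
    STUFF(@s, CHARINDEX(N'>', @s), 0, N' tableName="T_Contracts"')
AS xml);
```

Compared with REPLACE, this does not depend on the exact text of the root tag, only on where it ends.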
UPDATE
Eureka, I just found a way to create this without switching to string, but it's clumsy :-)
DECLARE @tbl TABLE(ID INT,SomeValue INT);
INSERT INTO @tbl VALUES(1,1),(2,NULL);
SELECT
(
SELECT 'T_Contracts' AS [@tableName]
,(
SELECT 'SomeRowAttr' AS [@testAttr] --added this to test row-level attributes
,*
FROM @tbl
FOR XML PATH('row'),TYPE, ELEMENTS XSINIL
)
FOR XML PATH('table'),TYPE, ELEMENTS XSINIL
).query('<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">{/table/@*}
{
for $nd in /table/row
return
<row>{$nd/@*}
{
$nd/*
}
</row>
}
</table>');
The result
<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" tableName="T_Contracts">
<row testAttr="SomeRowAttr">
<ID>1</ID>
<SomeValue>1</SomeValue>
</row>
<row testAttr="SomeRowAttr">
<ID>2</ID>
<SomeValue xsi:nil="true" />
</row>
</table>
Why don't you build the root element manually?
Example:
with CTE as (
select (select * from T_Contracts for xml path('row')) as MyXML
)
select '<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" tableName="T_Contracts">' +
MyXML +
'</table>'
from CTE
Unfortunately, SQL Server cannot do this out of the box, and there is no elegant workaround. To alleviate the issue, you can replace NULLs with empty strings. This removes the repeated xmlns declaration, but you have to define your select list explicitly, as follows. Moreover, this only works cleanly with character string data types: with, for example, an integer column, the empty string ('' in the ISNULL function) is implicitly converted to 0 instead of producing an empty element.
;WITH XMLNAMESPACES
(
'http://www.w3.org/2001/XMLSchema-instance' AS xsi
--,DEFAULT 'http://www.w3.org/2001/XMLSchema-instance' -- xmlns
)
SELECT 'T_Contracts' AS "@tableName",
(
SELECT
ISNULL(VTR_UID, '') 'row/VTR_UID'
,ISNULL(VTR_MDT_ID, '') 'row/VTR_MDT_ID'
,ISNULL(VTR_VTP_UID, '') 'row/VTR_VTP_UID'
,ISNULL(VTR_Nr, '') 'row/VTR_Nr'
FROM T_Contracts
FOR XML PATH(''), TYPE
)
FOR XML PATH('table'), TYPE, ELEMENTS xsinil
The result will be like below:
<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" tableName="T_Contracts">
<row>
<VTR_UID>779FE899-4E81-4D8C-BF9B-3F17BC1DF146</VTR_UID>
<VTR_MDT_ID>0</VTR_MDT_ID>
<VTR_VTP_UID />
<VTR_Nr>0050/132251</VTR_Nr>
</row>
</table>

Stripping data from xml in SQL Server

One of my tables with xml datatype has the following xml information:
<RequestMetaData xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<MetaData Type="DocImport">
<Keywords>
<Key Name="Zone" Value="MIO" />
<Key Name="ClassificationStrategy" Value="NeedClassification" />
<Key Name="Folder" Value="0456e6ca" />
</Keywords>
</MetaData>
<MetaData Type="SourceResponse">
<Keywords>
<Key Name="NotificationResponse_20180427-150426" Value="Received successful response from Source" />
</Keywords>
</MetaData>
</RequestMetaData>
I need to write a SQL query to fetch the value of ClassificationStrategy based on the key name.
I have put the XML in a variable @xml and used the following code, but it returns NULL.
select A.b.value('ClassificationStrategy[1]', 'VARCHAR(30)') AS CS
FROM @xml.nodes('/RequestMetaData/MetaData/Keywords') AS A(b)
Can someone please help me with this?
You can read your XML in various ways. Use a simple .value() with an XPath/XQuery expression to retrieve a single value, use .query() to retrieve a part of the XML, or use .nodes() to return repeated elements as a derived table:
DECLARE @xml XML=
N'<RequestMetaData xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<MetaData Type="DocImport">
<Keywords>
<Key Name="Zone" Value="MIO" />
<Key Name="ClassificationStrategy" Value="NeedClassification" />
<Key Name="Folder" Value="0456e6ca" />
</Keywords>
</MetaData>
<MetaData Type="SourceResponse">
<Keywords>
<Key Name="NotificationResponse_20180427-150426" Value="Received successful response from Source" />
</Keywords>
</MetaData>
</RequestMetaData>';
--Read the whole lot
SELECT md.value('@Type','nvarchar(max)') AS MetaDataType
,k.value('@Name','nvarchar(max)') AS KeyName
,k.value('@Value','nvarchar(max)') AS KeyValue
FROM @xml.nodes('/RequestMetaData/MetaData') A(md)
OUTER APPLY md.nodes('Keywords/Key') B(k);
--Get one key's value by name (anywhere in the doc)
DECLARE @keyName VARCHAR(100)='ClassificationStrategy';
SELECT @xml.value('(//Key[@Name=sql:variable("@keyName")]/@Value)[1]','nvarchar(max)');
--Use the meta data type as additional filter (if key names are not unique per doc)
DECLARE @kName VARCHAR(100)='ClassificationStrategy';
DECLARE @mdType VARCHAR(100)='DocImport';
SELECT @xml.value('(/RequestMetaData
/MetaData[@Type=sql:variable("@mdType")]
/Keywords
/Key[@Name=sql:variable("@kName")]
/@Value)[1]','nvarchar(max)');

TSQL XML Parsing and creating xml

I have a tool that I am now building reports for, using the data it already holds. I am currently working on a year-to-date report and need to pull the numbers for it.
My goal is to have an XML output of each of the months in the current year with their totals.
Here is what the XML currently looks like with my select statement:
<root>
<data>
<classXML>
<courses>
<class>
<classTitle>Arts and Crafts</classTitle>
<tuitionCost>100</tuitionCost>
<bookCost>30</bookCost>
<classTotal>130</classTotal>
</class>
<class>
<classTitle>Paper 101</classTitle>
<tuitionCost>320</tuitionCost>
<bookCost>211</bookCost>
<classTotal>531</classTotal>
</class>
<class>
<classTitle>Introduction to Pencils</classTitle>
<tuitionCost>210</tuitionCost>
<bookCost>291</bookCost>
<classTotal>501</classTotal>
</class>
<class>
<classTitle>Intermediate Folding</classTitle>
<tuitionCost>110</tuitionCost>
<bookCost>22</bookCost>
<classTotal>132</classTotal>
</class>
<class>
<classTitle>Advanced Jumprope</classTitle>
<tuitionCost>11</tuitionCost>
<bookCost>22</bookCost>
<classTotal>33</classTotal>
</class>
<grandTotal>1327</grandTotal>
</courses>
</classXML>
<reimbursementDate>08/01/2014</reimbursementDate>
</data>
<data>
<classXML>
<courses>
<class>
<classTitle>dsfgfdsg</classTitle>
<tuitionCost>44</tuitionCost>
<bookCost>44</bookCost>
<classTotal>88</classTotal>
</class>
<grandTotal>88</grandTotal>
</courses>
</classXML>
<reimbursementDate>05/31/2014</reimbursementDate>
</data>
</root>
And my stored procedure:
SELECT
A.[classXML],
CONVERT(VARCHAR(10), A.[reimbursementDate], 101) as reimbursementDate
FROM
tuitionSubmissions as A
WHERE
A.[status] = 'Approved'
AND YEAR(A.[reimbursementDate]) = YEAR(GETDATE())
FOR XML PATH ('data'), TYPE, ELEMENTS, ROOT ('root');
As you can see, the column classXML stores that data in XML format with all of the classes they are enrolled in with their costs.
So I need to loop over the XML and create an output that is just numbers to assist with my reporting.
Here is my desired outcome:
<results>
<dataSet>
<month>8</month>
<year>2014</year>
<tuitionTotal>500</tuitionTotal>
<booksTotal>200</booksTotal>
<grandTotal>700</grandTotal>
</dataSet>
<dataSet>
<month>9</month>
<year>2014</year>
<tuitionTotal>100</tuitionTotal>
<booksTotal>500</booksTotal>
<grandTotal>600</grandTotal>
</dataSet>
</results>
You can use the XQuery sum function to do the aggregation against your XML column.
I put the query against the XML in a cross apply so you don't have to run the same XQuery twice just to calculate grandTotal.
You should also change your predicate on reimbursementDate so that it can use an index to find the rows.
select datepart(month, T.reimbursementDate) as month,
datepart(year, T.reimbursementDate) as year,
S.tuitionTotal,
S.booksTotal,
S.tuitionTotal + S.booksTotal as grandTotal
from dbo.tuitionSubmissions as T
cross apply (
select T.classXML.value('sum(/courses/class/tuitionCost/text())', 'int') as tuitionTotal,
T.classXML.value('sum(/courses/class/bookCost/text())', 'int') as booksTotal
) as S
where T.status = 'Approved' and
T.reimbursementDate >= '20140101' and
T.reimbursementDate < '20150101'
for xml path('dataSet'), root('results'), type
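To see what the XQuery sum function returns on its own, here is a small sketch against an XML variable. The variable @classXML and the two-class sample are assumptions for illustration, shaped like the content of the classXML column in the question.

```sql
-- Hypothetical sample with the same shape as one classXML value.
DECLARE @classXML xml = N'
<courses>
  <class><tuitionCost>100</tuitionCost><bookCost>30</bookCost></class>
  <class><tuitionCost>320</tuitionCost><bookCost>211</bookCost></class>
</courses>';

-- sum() aggregates over every matching text node inside the XQuery,
-- so no shredding with nodes() is needed for a per-row total.
SELECT @classXML.value('sum(/courses/class/tuitionCost/text())', 'int') AS tuitionTotal,
       @classXML.value('sum(/courses/class/bookCost/text())', 'int')    AS booksTotal;
-- tuitionTotal = 420, booksTotal = 241
```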
DECLARE @DocH INT
DECLARE @DOC XML = '
<root>
<data>
<classXML>
<courses>
<class>
<classTitle>Arts and Crafts</classTitle>
<tuitionCost>100</tuitionCost>
<bookCost>30</bookCost>
<classTotal>130</classTotal>
</class>
<class>
<classTitle>Paper 101</classTitle>
<tuitionCost>320</tuitionCost>
<bookCost>211</bookCost>
<classTotal>531</classTotal>
</class>
<class>
<classTitle>Introduction to Pencils</classTitle>
<tuitionCost>210</tuitionCost>
<bookCost>291</bookCost>
<classTotal>501</classTotal>
</class>
<class>
<classTitle>Intermediate Folding</classTitle>
<tuitionCost>110</tuitionCost>
<bookCost>22</bookCost>
<classTotal>132</classTotal>
</class>
<class>
<classTitle>Advanced Jumprope</classTitle>
<tuitionCost>11</tuitionCost>
<bookCost>22</bookCost>
<classTotal>33</classTotal>
</class>
<grandTotal>1327</grandTotal>
</courses>
</classXML>
<reimbursementDate>08/01/2014</reimbursementDate>
</data>
<data>
<classXML>
<courses>
<class>
<classTitle>dsfgfdsg</classTitle>
<tuitionCost>44</tuitionCost>
<bookCost>44</bookCost>
<classTotal>88</classTotal>
</class>
<grandTotal>88</grandTotal>
</courses>
</classXML>
<reimbursementDate>05/31/2014</reimbursementDate>
</data>
</root>'
EXEC sp_xml_preparedocument @DocH OUTPUT, @DOC
SELECT
MONTH(reimbursementDate) AS month
, YEAR(reimbursementDate) AS year
, SUM(tuitionCost) AS tuitionTotal, SUM(bookCost) AS bookTotal, SUM(tuitionCost+bookCost) AS grandTotal
FROM OPENXML(@DocH,'/root/data/classXML/courses/class') WITH (
classTitle varchar(40) 'classTitle'
, tuitionCost INT 'tuitionCost'
, bookCost INT 'bookCost'
, reimbursementDate date '../../../reimbursementDate'
)
GROUP BY MONTH(reimbursementDate)
, YEAR(reimbursementDate)
FOR XML PATH ('dataset')
EXEC sp_xml_removedocument @DocH;

how to update or query xml with xmlns attributes

Say my XML document is this:
<root xmlns="http://www.w3.org/2001/XMLSchema-instance">
<parent prop="1">
<child>
<field name="1">
<value1>abc</value1>
<value2>cdf</value2>
</field>
<field name="2">
<value1>efg</value1>
<value2>hjk</value2>
</field>
</child>
</parent>
<parent2>
<prop atrb="2">abc</prop>
</parent2>
</root>
I have it in a table newTable2, in an xml-typed column named xmlcol1.
Here is the query I wrote:
SELECT xmlcol1.query('/root/parent/child/field/value1/text()') AS a
FROM newTable2
This works when I remove the xmlns attribute, but if I put it back it does not. Can anyone explain why that is, and how I can run the same query while keeping the xmlns attribute?
Try this:
;with xmlnamespaces (
default 'http://www.w3.org/2001/XMLSchema-instance'
)
SELECT xmlcol1.query('/root/parent/child/field/value1/text()') AS a_query
, xmlcol1.value('(/root/parent/child/field/value1/text())[1]', 'varchar(255)') AS a_value_1
, xmlcol1.value('(/root/parent/child/field/value1/text())[2]', 'varchar(255)') AS a_value_2
FROM newTable2
Never mind, I found the answer.
I just need to use
;WITH XMLNAMESPACES(DEFAULT 'http://www.w3.org/2001/XMLSchema-instance')
before the query.
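As for why the original query returns nothing: the xmlns attribute puts every element of the document into that namespace, while an unprefixed path step such as /root only matches elements in no namespace. Besides WITH XMLNAMESPACES, the namespace can be declared in the XQuery prolog, or matched with a namespace wildcard. The sketch below uses a made-up variable @x holding a shortened copy of the document from the question.

```sql
DECLARE @x xml = N'
<root xmlns="http://www.w3.org/2001/XMLSchema-instance">
  <parent prop="1">
    <child>
      <field name="1"><value1>abc</value1></field>
    </child>
  </parent>
</root>';

SELECT
    -- No namespace declared: the path matches nothing.
    @x.value('(/root/parent/child/field/value1/text())[1]', 'varchar(255)') AS no_namespace,
    -- Default element namespace declared inside the XQuery prolog.
    @x.value('declare default element namespace "http://www.w3.org/2001/XMLSchema-instance";
              (/root/parent/child/field/value1/text())[1]', 'varchar(255)') AS with_prolog,
    -- Namespace wildcard: matches the local name in any namespace.
    @x.value('(/*:root/*:parent/*:child/*:field/*:value1/text())[1]', 'varchar(255)') AS with_wildcard;
-- no_namespace is NULL; with_prolog and with_wildcard both return abc
```

The wildcard form is convenient for ad hoc queries, while declaring the namespace keeps the path explicit about which elements it is meant to match.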

How do I set the xmlns attribute on the root element in the generated XML by using T-SQL's xml data type method: query?

I've created a simplified version of my problem:
DECLARE @X XML =
'<Root xmlns="TestNS" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Test>
<Id>1</Id>
<InnerCollection>
<InnerItem>
<Value>1</Value>
</InnerItem>
<InnerItem>
<Value>2</Value>
</InnerItem>
<InnerItem>
<Value>3</Value>
</InnerItem>
</InnerCollection>
</Test>
<Test>
<Id>2</Id>
<InnerCollection>
<InnerItem>
<Value>5</Value>
</InnerItem>
<InnerItem>
<Value>6</Value>
</InnerItem>
<InnerItem>
<Value>7</Value>
</InnerItem>
</InnerCollection>
</Test>
</Root>'
I'm trying to write a query that takes each <Test> element and breaks it into a row. On each row I want to select the Id and the InnerCollection as XML. I want to create this InnerCollection XML for the first row (Id:1):
<InnerCollection xmlns="Reed.Api" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<InnerItem>
<Value>1</Value>
</InnerItem>
<InnerItem>
<Value>2</Value>
</InnerItem>
<InnerItem>
<Value>3</Value>
</InnerItem>
</InnerCollection>
I tried doing that with this query but it puts a namespace I don't want on the elements:
;WITH XMLNAMESPACES
(
DEFAULT 'TestNS'
, 'http://www.w3.org/2001/XMLSchema-instance' AS i
)
SELECT
X.value('Id[1]', 'INT') Id
-- Creates a p1 namespace that I don't want.
, X.query('InnerCollection') InnerCollection
FROM @X.nodes('//Test') AS T(X)
My Google-fu isn't very strong today, but I imagine it doesn't make it any easier that the darn function is called query. I'm open to using other methods to create that XML value other than the query method.
I could use this method:
;WITH XMLNAMESPACES
(
DEFAULT 'TestNS'
, 'http://www.w3.org/2001/XMLSchema-instance' AS i
)
SELECT
X.value('Id[1]', 'INT') Id
,CAST(
(SELECT
InnerNodes.Node.value('Value[1]', 'INT') AS 'Value'
FROM X.nodes('./InnerCollection[1]//InnerItem') AS InnerNodes(Node)
FOR XML PATH('InnerItem'), ROOT('InnerCollection')
) AS XML) AS InnerCollection
FROM @X.nodes('//Test') AS T(X)
But that involves calling nodes on it to break it out into something selectable, and then selecting it back into XML using FOR XML... when it was XML to begin with. This seems like an inefficient way of doing this, so I'm hoping someone here will have a better idea.
This is how to do the SELECT using the query method to create the XML on each row that my question was looking for:
;WITH XMLNAMESPACES
(
'http://www.w3.org/2001/XMLSchema-instance' AS i
, DEFAULT 'TestNS'
)
SELECT
Test.Row.value('Id[1]', 'INT') Id
, Test.Row.query('<InnerCollection xmlns="TestNS" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">{InnerCollection}</InnerCollection>')
FROM @X.nodes('/Root/Test') AS Test(Row)