Stripping data from xml in SQL Server

One of my tables with xml datatype has the following xml information:
<RequestMetaData xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<MetaData Type="DocImport">
<Keywords>
<Key Name="Zone" Value="MIO" />
<Key Name="ClassificationStrategy" Value="NeedClassification" />
<Key Name="Folder" Value="0456e6ca" />
</Keywords>
</MetaData>
<MetaData Type="SourceResponse">
<Keywords>
<Key Name="NotificationResponse_20180427-150426" Value="Received successful response from Source" />
</Keywords>
</MetaData>
</RequestMetaData>
I need to write a SQL query that fetches the value of ClassificationStrategy based on the key name.
I have put the XML in a variable @xml and used the following code, but it returns NULL.
select A.b.value('ClassificationStrategy[1]', 'VARCHAR(30)') AS CS
FROM @xml.nodes('/RequestMetaData/MetaData/Keywords') AS A(b)
Can someone please help me with this.

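Your query returns NULL because ClassificationStrategy is not an element name; it is the value of the Name attribute on a Key element, so the path has to filter on @Name and read @Value. You can read your XML in various ways: use .value() with an XPath/XQuery expression to retrieve a single value, .query() to retrieve a part of the XML, or .nodes() to return repeated elements as a derived table:
DECLARE @xml XML=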
N'<RequestMetaData xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<MetaData Type="DocImport">
<Keywords>
<Key Name="Zone" Value="MIO" />
<Key Name="ClassificationStrategy" Value="NeedClassification" />
<Key Name="Folder" Value="0456e6ca" />
</Keywords>
</MetaData>
<MetaData Type="SourceResponse">
<Keywords>
<Key Name="NotificationResponse_20180427-150426" Value="Received successful response from Source" />
</Keywords>
</MetaData>
</RequestMetaData>';
--Read the whole lot
SELECT md.value('@Type','nvarchar(max)') AS MetaDataType
,k.value('@Name','nvarchar(max)') AS KeyName
,k.value('@Value','nvarchar(max)') AS KeyValue
FROM @xml.nodes('/RequestMetaData/MetaData') A(md)
OUTER APPLY md.nodes('Keywords/Key') B(k);
--Get one key's value by name (anywhere in the doc)
DECLARE @keyName VARCHAR(100)='ClassificationStrategy';
SELECT @xml.value('(//Key[@Name=sql:variable("@keyName")]/@Value)[1]','nvarchar(max)');
--Use the meta data type as additional filter (if key names are not unique per doc)
DECLARE @kName VARCHAR(100)='ClassificationStrategy';
DECLARE @mdType VARCHAR(100)='DocImport';
SELECT @xml.value('(/RequestMetaData
/MetaData[@Type=sql:variable("@mdType")]
/Keywords
/Key[@Name=sql:variable("@kName")]
/@Value)[1]','nvarchar(max)');
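Since the question mentions that the XML lives in a table column rather than a variable, here is a minimal sketch of the same lookup applied per row. The table variable @Requests and its columns RequestId and MetaDataXml are hypothetical names used only for illustration; the XML is trimmed to the parts that matter.

```sql
-- Hypothetical table and column names, for illustration only
DECLARE @Requests TABLE (RequestId INT, MetaDataXml XML);

INSERT INTO @Requests (RequestId, MetaDataXml) VALUES
(1, N'<RequestMetaData><MetaData Type="DocImport"><Keywords><Key Name="ClassificationStrategy" Value="NeedClassification" /></Keywords></MetaData></RequestMetaData>'),
(2, N'<RequestMetaData><MetaData Type="DocImport"><Keywords><Key Name="Zone" Value="MIO" /></Keywords></MetaData></RequestMetaData>');

DECLARE @keyName VARCHAR(100) = 'ClassificationStrategy';

-- .value() is evaluated once per row; a row without the key yields NULL
SELECT r.RequestId
      ,r.MetaDataXml.value('(/RequestMetaData/MetaData/Keywords/Key[@Name=sql:variable("@keyName")]/@Value)[1]','nvarchar(max)') AS KeyValue
FROM @Requests AS r;
```

A fully qualified path is used here instead of //Key, which is usually the cheaper choice when the structure is known.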

Related

How to extract attribute value from XML in SQL Server 2019 (v15)?

I would need to extract elements from this XML into a tabular form, but I can't seem to get my head around how this would work on SQL Server via something like XQuery.
I have all the data in a temporary table called "#1" and the XML itself lies in a field called "Message" in that temporary table. How can I extract the values "Test1" and "2,2 %" into separate fields called "W08003" and "W1A081", respectively? The attribute names and the schema will remain the same over time. I would also need to do this on a row by row basis for each XML in the current temporary table.
<Individual xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Content>
<status xmlns:d3p1="http://www.uc.se/schemas/ucOrderReply/" xmlns="http://www.uc.se/schemas/ucOrderReply/" d3p1:result="ok" />
<uc xmlns="http://www.uc.se/schemas/ucOrderReply/">
<xmlReply>
<reports xmlns:d5p1="http://www.uc.se/schemas/ucOrderReply/" d5p1:lang="eng">
<d5p1:report d5p1:id="7605089247" d5p1:name="Test1 Test2" d5p1:styp="K39" d5p1:index="0">
<d5p1:group d5p1:id="W080" d5p1:index="0" d5p1:key="" d5p1:name="ID particulars">
<d5p1:term d5p1:id="W08001">9760508923</d5p1:term>
<d5p1:term d5p1:id="W08002">7605089277</d5p1:term>
<d5p1:term d5p1:id="W08003">Test1</d5p1:term>
<d5p1:term d5p1:id="W08004">Test2</d5p1:term>
</d5p1:group>
<d5p1:group d5p1:id="W1A0" d5p1:index="0" d5p1:key="" d5p1:name="UC RPB">
<d5p1:term d5p1:id="W1A003">000000000000000022</d5p1:term>
<d5p1:term d5p1:id="W1A081">2,2 %</d5p1:term>
<d5p1:term d5p1:id="W1A082">2,18839</d5p1:term>
</d5p1:group>
</d5p1:report>
</reports>
</xmlReply>
</uc>
</Content>
</Individual>
Current SQL code:
WITH XMLNAMESPACES('http://www.uc.se/schemas/ucOrderReply/' AS ns,'http://www.uc.se/schemas/ucOrderReply/' AS d5p1)
SELECT ok.*
,X.g.value('(@d5p1:id)','varchar(20)') AS id
,X.g.value('(text())[1]','varchar(20)') AS term
into #2
FROM #1 as ok
CROSS APPLY ok.[Message].nodes('Individual/Content/ns:uc/ns:xmlReply/ns:reports/ns:report/ns:group/ns:term') X(g)
With no expected results, perhaps this is enough to get you started.
As you define a default namespace only once you get to status, you can't use a DEFAULT namespace in XMLNAMESPACES, so I name it ns and reference that instead. This gives you the value of all the terms and their id attribute:
DECLARE @XML xml = '<Individual xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Content>
<status xmlns:d3p1="http://www.uc.se/schemas/ucOrderReply/" xmlns="http://www.uc.se/schemas/ucOrderReply/" d3p1:result="ok" />
<uc xmlns="http://www.uc.se/schemas/ucOrderReply/">
<xmlReply>
<reports xmlns:d5p1="http://www.uc.se/schemas/ucOrderReply/" d5p1:lang="eng">
<report d5p1:id="7605089247" d5p1:name="Test1 Test2" d5p1:styp="K39" d5p1:index="0">
<group d5p1:id="W080" d5p1:index="0" d5p1:key="" d5p1:name="ID particulars">
<term d5p1:id="W08001">9760508923</term>
<term d5p1:id="W08002">7605089277</term>
<term d5p1:id="W08003">Test1</term>
<term d5p1:id="W08004">Test2</term>
</group>
<group d5p1:id="W1A0" d5p1:index="0" d5p1:key="" d5p1:name="UC RPB">
<term d5p1:id="W1A003">000000000000000022</term>
<term d5p1:id="W1A081">2,2 %</term>
<term d5p1:id="W1A082">2,18839</term>
</group>
</report>
</reports>
</xmlReply>
</uc>
</Content>
</Individual>';
WITH XMLNAMESPACES('http://www.uc.se/schemas/ucOrderReply/' AS ns,'http://www.uc.se/schemas/ucOrderReply/' AS d5p1)
SELECT X.g.value('(@d5p1:id)','varchar(20)') AS id,
X.g.value('(text())[1]','varchar(20)') AS term
FROM @XML.nodes('Individual/Content/ns:uc/ns:xmlReply/ns:reports/ns:report/ns:group/ns:term') X(g);
I note that the XML has been changed since the initial version I used to write this answer. This answer has not been (read "won't be") adjusted for that.
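The question asks for "Test1" and "2,2 %" as separate columns named W08003 and W1A081, one row per stored message. A minimal sketch of that shape, assuming a hypothetical table variable @Messages (standing in for the temp table #1) and a trimmed-down sample document, could look like this. Both the elements and the id attribute sit in the same namespace, so a single prefix ns is enough.

```sql
-- @Messages is a hypothetical stand-in for the temp table #1
DECLARE @Messages TABLE (Id INT, [Message] XML);

INSERT INTO @Messages (Id, [Message]) VALUES
(1, N'<Individual><Content><uc xmlns="http://www.uc.se/schemas/ucOrderReply/"><xmlReply>
<reports xmlns:d5p1="http://www.uc.se/schemas/ucOrderReply/">
<d5p1:report>
<d5p1:group d5p1:id="W080"><d5p1:term d5p1:id="W08003">Test1</d5p1:term></d5p1:group>
<d5p1:group d5p1:id="W1A0"><d5p1:term d5p1:id="W1A081">2,2 %</d5p1:term></d5p1:group>
</d5p1:report>
</reports></xmlReply></uc></Content></Individual>');

-- One .value() call per wanted term, filtered on the namespaced id attribute
WITH XMLNAMESPACES('http://www.uc.se/schemas/ucOrderReply/' AS ns)
SELECT m.Id
      ,m.[Message].value('(//ns:term[@ns:id="W08003"]/text())[1]','varchar(20)') AS W08003
      ,m.[Message].value('(//ns:term[@ns:id="W1A081"]/text())[1]','varchar(20)') AS W1A081
FROM @Messages AS m;
```

The // shortcut keeps the sketch short; spelling out the full path from Individual down to ns:term is likely faster on large documents.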

SQL string value

I have a large XML document stored in a field within a table. Many values have already been extracted and stored in the table, but I'm looking to capture two additional ones: account type = "current" status="X" and account type = "former" status="Y". In the partial output below there is no former account type, so I need a strategy for missing values as well.
<ncf_report xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://cp.com/rules/client">
<admin>
<product_reference>12345678901234</product_reference>
<report_type>XXXXXXX</report_type>
<status>XXXXXXXX</status>
<ownership>XXXXXXX</ownership>
<report_code>1234</report_code>
<report_description>XXXXXXXXXXXXXXXXX</report_description>
<purpose>XXXXXXXX</purpose>
<date_request_ordered>mm/dd/yyyy</date_request_ordered>
<date_request_received>mm/dd/yyyy</date_request_received>
<date_request_completed>mm/dd/yyyy</date_request_completed>
<time_report_processed>01234</time_report_processed>
<multiple_scores_ordered>false</multiple_scores_ordered>
<vendor name="XXXXXXXXXXXXX" address="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" />
<report>
<sequence>0000000000</sequence>
<count>0000000000</count>
</report>
</admin>
<report>
<alerts_scoring>
<scoring>
<score status="XXXXXXXXXX">
<model_label>XXXXXXXXXXXXXXXXX</model_label>
<score>123</score>
<rating_state>XX</rating_state>
<classification>XXXXXXXXXXXXXXXXX</classification>
<reason_codes>
<code>05</code>
<description>XXXXXXXXXXXXXXXXX</description>
</reason_codes>
<reason_codes>
<code>04</code>
<description>XXXXXXXXXXXXXXXXX</description>
</reason_codes>
<reason_codes>
<code>10</code>
<description>XXXXXXXXXXXXXXXXX</description>
</reason_codes>
<reason_codes>
<code>27</code>
<description>XXXXXXXXXXXXXXXXX</description>
</reason_codes>
</score>
</scoring>
<general>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</general>
<general>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</general>
<general>XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</general>
</alerts_scoring>
<vendor_dataset>
<subjects>
<subject type="Primary" relationship_to_data="Subject">
<name type="Report Subject">
<first>XXXX</first>
<middle>X</middle>
<last>XXXX</last>
</name>
<name type="Alias">
<first>XXXXXXXXXX</first>
<last>XXXXXXXX</last>
</name>
<birth_date>mm/dd/yyyy</birth_date>
<ssn>999999999</ssn>
<address type="residence" ref="1" />
<address type="former" ref="2" />
<address type="former" ref="3" />
</subject>
</subjects>
<addresses>
<address id="1">
<house>1234</house>
<street1>sample</street1>
<city>sample</city>
<state>XX</state>
<postalcode>12345</postalcode>
<zip4>1234</zip4>
<date_first_at_address>mm/dd/yyyy</date_first_at_address>
<date_last_at_address>mm/dd/yyyy</date_last_at_address>
</address>
<address id="X">
<house>1234</house>
<street1>XXXXXXXXX</street1>
<city>XXXXXXXXX</city>
<state>XX</state>
<postalcode>12345</postalcode>
<zip4>1234</zip4>
<date_first_at_address>mm/dd/yyyy</date_first_at_address>
<date_last_at_address>mm/dd/yyyy</date_last_at_address>
</address>
</addresses>
</vendor_dataset>
<summary>
<date_oldest_trade>mm/dd/yyyy</date_oldest_trade>
<date_latest_trade>mm/dd/yyyy</date_latest_trade>
<date_latest_activity>mm/dd/yyyy</date_latest_activity>
<includes_bankruptcies flag="false" />
<includes_other_records public_records="false" collection="false" consumer_statement="false" />
<credit_range high="12345" low="123" number_trade_lines="123" />
<account_status_counters>
<!-- here --> <account type="current" description="Pays Account as Agreed" status="1">12</account>
</account_status_counters>
<account_summaries>
<account type="Open-ended">
<number_accounts>0</number_accounts>
<total_owed>0</total_owed>
<total_past_due>0</total_past_due>
<high_amount>0</high_amount>
</account>
<account type="Revolving">
<number_accounts>00</number_accounts>
<total_owed>1234</total_owed>
<total_past_due>0</total_past_due>
<high_amount>12345</high_amount>
</account>
<account type="Installment">
<number_accounts>00</number_accounts>
<total_owed>12345</total_owed>
<total_past_due>0</total_past_due>
<high_amount>123456</high_amount>
</account>
</account_summaries>
<inquiry_history count="0" />
</summary>
<employment_history>
<employment_primary_subject>
<job entry="current" indirectly_verified="false">
<employer>
<name>XXXXXXXXXXX</name>
</employer>
</job>
<job entry="first_former" indirectly_verified="false">
<employer>
<name>XXXXXXXX</name>
<city>XXXXXXX</city>
<state>XX</state>
</employer>
</job>
</employment_primary_subject>
</employment_history>
<trade_account_activity>
<credit_trades>
<credit_trade automated_tape_supplier="false">
<reporting_member>
<number>1234X1234</number>
<name>XXX/1234</name>
</reporting_member>
<account>
<type>XXXXXXXXX</type>
<terms>XXX</terms>
<months_reviewed>00</months_reviewed>
<designator>XXXXXXXX(XXXXX)</designator>
</account>
<date_reported>mm/dd/yyyy</date_reported>
<date_opened>mm/dd/yyyy</date_opened>
<date_last_activity>mm/dd/yyyy</date_last_activity>
<current_rate>XXXXXXXXXXXXXXXXX</current_rate>
<highest_amount>1234</highest_amount>
<balance_amount>00</balance_amount>
<past_due_amount>00</past_due_amount>
<messages>
<message code="XX">XXXXXXXXXXXXXX</message>
<message code="XX">XXXXXXXXXXXXXX</message>
</messages>
</credit_trade>
</account>
<date_reported>mm/dd/yyyy</date_reported>
<date_opened>mm/dd/yyyy</date_opened>
<date_last_activity>mm/dd/yyyy</date_last_activity>
<current_rate>XXXXXXXXXXXXXXXXXXXXXX</current_rate>
<highest_amount>123456</highest_amount>
<balance_amount>123456</balance_amount>
<past_due_amount>0</past_due_amount>
<messages>
<message code="XX">XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX</message>
</messages>
</credit_trade>
</credit_trades>
</trade_account_activity>
<inquiry_history>
<inquiry date="mm/dd/yyyy" name="XXXXXXXXXXXXXXXXXXXXXXX" member="12345X1234" />
<inquiry date="mm/dd/yyyy" name="XXXXXXXXXXXXXXXXXXXXXXX" member="12345Y1234" />
<inquiry date="mm/dd/yyyy" name="CXXXXXXXXXXXXXXXXXXXXXXX" member="12345Z1234" />
<inquiry date="mm/dd/yyyy" name="XXXXXXXXXXXXXXXXXXXXXXX &amp; X" member="12345W1234" />
<inquiry date="mm/dd/yyyy" name="XXXXXXXXXXXXXXXXXXXXXXX" member="12345V1234" />
<inquiry date="mm/dd/yyyy" name="XXXXXXXXXXXXXXXXXXXXXXX" member="12345U1234" />
<inquiry date="mm/dd/yyyy" name="XXXXXXXXXXXXXXXXXXXXXXX" member="12345T1234" />
</inquiry_history>
</report>
</ncf_report>
I'm looking to extract the X value from account type = "current" status="X" and the Y value if an account type = "former" exists. In this case the value is 1. I added a comment to the XML to highlight the area of interest. I started by paring down the data set into a temp table.
select id,
LEFT(SUBSTRING(CreditscoreXML,charindex('<account type="current"',CreditscoreXML),charindex('</account>',CreditscoreXML)),charindex('">',SUBSTRING(CreditscoreXML,charindex('<account type="current"',CreditscoreXML),charindex('</account>',CreditscoreXML)))) [Current_Status]
select
Current_Status, --just so I see output is correct in temp table
substring(Current_Status, charindex('status="',
Current_Status)+8,len(Current_Status)-charindex('status',Current_Status)) [Current_Status]
from #TempCurrent
From here I tried to refine the text search further. I am trying to figure out how to eliminate the " after the 1, or find a better solution that extracts both the current and the former status. The former status can be missing, and the result needs to be grouped by Id.
Current Output
Current_Worse_Score Current_Worse_Score Former_Worse
Original Text 1"
Rather than manipulating string data, try using the built-in XML functions in SQL Server to make your life easier. For example:
create table dbo.Foo (
id int not null,
bar xml not null
);
insert dbo.Foo (id, bar) values (47, N'<ncf_report xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://cp.com/rules/client">
<!-- the rest of your xml... -->
</ncf_report>')
;with xmlnamespaces(default 'http://cp.com/rules/client')
select
id,
x.a.value(N'@description', N'nvarchar(50)') as [Description],
x.a.value(N'@status', N'nvarchar(50)') as [Status],
x.a.value(N'.', N'nvarchar(50)') as [Account]
from dbo.Foo
cross apply bar.nodes(N'/ncf_report/report/summary/account_status_counters/account[@type="current"]') x(a)
Which yields the result...
id  Description             Status  Account
47  Pays Account as Agreed  1       12
You can use the built-in xml data type methods to query the required values from an XML instance that is stored in an xml column.
DECLARE @X XML;
SET @X = '<ncf_report xmlns:xsd="http://www.w3.org/2001/XMLSchema" ....>'; -- provided XML instance
CREATE TABLE NCFREPORT
(
ncfreportcol XML NOT NULL
);
INSERT INTO ncfreport (ncfreportcol) values (@X); -- inserting the XML instance stored temporarily in the variable @X above
WITH xmlnamespaces ('http://cp.com/rules/client' as NR)
SELECT T.acc.value('(@status)[1]', 'int') AS Status,
T.acc.value('(@type)[1]', 'varchar(20)') AS AccType,
T.acc.value('(text())[1]', 'int') AS Acc
FROM ncfreport cross apply ncfreport.ncfreportcol.nodes ('/NR:ncf_report/NR:report/NR:summary/NR:account_status_counters/NR:account') as t(acc);
This will result in the following output:
Status  AccType  Acc
1       current  12
It will produce one row in the output for each account if you have multiple account tags defined in the XML instance. I also noticed that there are missing opening or closing tags in the above XML fragment. It would be a good idea to also have a look at validating the XML before entering into the table. Please have a look at various XML data type methods here - https://learn.microsoft.com/en-us/sql/t-sql/xml/xml-data-type-methods?view=sql-server-ver15
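Neither query above addresses the "former can be missing" part directly. One way to get a single row per id with both statuses, sketched here under assumptions, is to call .value() once per account type: a type that is not present simply yields NULL. The table variable @Reports is hypothetical and the XML is cut down to the relevant branch. The question's CreditscoreXML column appears to hold text, so it would probably need a CAST or TRY_CAST to XML first.

```sql
-- @Reports is a hypothetical stand-in for the real table
DECLARE @Reports TABLE (id INT, CreditscoreXML XML);

INSERT INTO @Reports (id, CreditscoreXML) VALUES
(47, N'<ncf_report xmlns="http://cp.com/rules/client"><report><summary>
<account_status_counters>
<account type="current" description="Pays Account as Agreed" status="1">12</account>
</account_status_counters>
</summary></report></ncf_report>');

-- One column per account type; a missing "former" account comes back as NULL
WITH XMLNAMESPACES(DEFAULT 'http://cp.com/rules/client')
SELECT r.id
      ,r.CreditscoreXML.value('(/ncf_report/report/summary/account_status_counters/account[@type="current"]/@status)[1]','varchar(10)') AS Current_Status
      ,r.CreditscoreXML.value('(/ncf_report/report/summary/account_status_counters/account[@type="former"]/@status)[1]','varchar(10)') AS Former_Status
FROM @Reports AS r;
```

Because no .nodes() call is involved, each source row produces exactly one output row, so no GROUP BY is needed.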

Parsing XML in SQL without a namespace

The code will be self explanatory to the right person for this, but any questions please shout...
Thanks,
Dave
DECLARE @XML XML
SET @XML =
'<?xml version="1.0" encoding="utf-8"?>
<updates>
<versions>
<installer type="A" Xversion="101" iniSizeInBytes="22480" dataSizeInBytes="23396349" msiSizeInBytes="4732928" />
<installer type="B" Yversion="201" iniSizeInBytes="22480" dataSizeInBytes="116687353" msiSizeInBytes="5807616" webconfigModifierSizeInBytes="11800" />
<installer type="A" Xversion="102" iniSizeInBytes="22480" dataSizeInBytes="23396349" msiSizeInBytes="4732928" />
<installer type="B" Yversion="202" iniSizeInBytes="22480" dataSizeInBytes="116687353" msiSizeInBytes="5807616" webconfigModifierSizeInBytes="11800" />
</versions>
<update setNumber="1" XVersion="101" YVersion="201">
<detail Ref="1000">some detail info for 101 and 201</detail>
</update>
<update setNumber="2" XVersion="102" YVersion="202">
<detail Ref="1001">some detail info for 102 and 202</detail>
</update>
</updates>
'
SELECT
r.value('@ref','NVARCHAR(250)') as 'Ref', --This is wrong, but you can probably see I'm wanting the value of Ref, e.g. 1000 for line 1, 1001 for line 2
t.r.query('./detail').value('.','nvarchar(max)') as 'Detail'
FROM @XML.nodes('/updates/update') AS t(r);
The Ref attribute belongs to the detail element, so point .nodes() at detail and read the attribute from there:
SELECT
r.value('@Ref','NVARCHAR(250)') as 'Ref', -- It should be @Ref instead of @ref
r.query('.').value('.','nvarchar(max)') as 'Detail'
FROM @XML.nodes('/updates/update/detail') AS t(r);
Note: Element and attribute names in XML are case-sensitive, so you cannot use @ref in the query when the attribute is named Ref.
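If attributes of the parent update element are needed in the same row, an alternative sketch is to keep .nodes() on update and reach into detail with a relative path. The column aliases SetNumber, Ref and Detail are only illustrative, and the sample document is trimmed to the update elements.

```sql
DECLARE @XML XML = N'<updates>
<update setNumber="1" XVersion="101" YVersion="201">
<detail Ref="1000">some detail info for 101 and 201</detail>
</update>
<update setNumber="2" XVersion="102" YVersion="202">
<detail Ref="1001">some detail info for 102 and 202</detail>
</update>
</updates>';

-- The context node is <update>; detail/@Ref is a relative path from there.
-- Names are matched case-sensitively, so @Ref and @setNumber must be spelled exactly.
SELECT u.value('@setNumber','int') AS SetNumber
      ,u.value('(detail/@Ref)[1]','nvarchar(250)') AS Ref
      ,u.value('(detail/text())[1]','nvarchar(max)') AS Detail
FROM @XML.nodes('/updates/update') AS t(u);
```

The (...)[1] wrapper is needed for the detail paths because .value() requires an expression that is guaranteed to return at most one item.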

Query xml using sql server

I am new to XML queries. I have an XML document like this:
<fields>
<fields name = "a" active ="1" mandat ="true"/>
<fields name = "a" active ="1"/>
</fields>
Now I need to find all the field names where mandat is true. How can I query the XML using SQL Server? Please help.
Your question is not quite clear, especially given the tags (sql, xml, c#-4.0), but from the question's text I take it that you need to query the XML's content within SQL Server.
You can try it like this:
DECLARE @xml XML=
N'<fields>
<fields name="a" active="1" mandat="true" />
<fields name="a" active="1" />
</fields>';
SELECT fld.value(N'@name',N'nvarchar(max)') AS Field_Name
,fld.value(N'@active',N'bit') AS Field_Active
,fld.value(N'@mandat',N'bit') AS Field_Mandant
FROM @xml.nodes(N'/fields/fields') AS A(fld)
The result
Field_Name  Field_Active  Field_Mandant
a           1             1
a           1             NULL
UPDATE
If you want to read the value of @name of the fields node where @mandat is "true", do it like this:
DECLARE @xml XML=
N'<fields>
<fields name="a" active="1" mandat="true" />
<fields name="a" active="1" />
</fields>';
SELECT @xml.value(N'(/fields/fields[@mandat="true"]/@name)[1]',N'nvarchar(max)') AS Mandant_Name
UPDATE 2: More than one <fields> with @mandat="true"
Just try my first solution with a predicate in .nodes():
DECLARE @xml XML=
N'<fields>
<fields name="a" active="1" mandat="true" />
<fields name="b" active="2" />
<fields name="c" active="3" mandat="false" />
<fields name="d" active="4" mandat="true" />
</fields>';
SELECT fld.value(N'@name',N'nvarchar(max)') AS Field_Name
,fld.value(N'@active',N'bit') AS Field_Active
,fld.value(N'@mandat',N'bit') AS Field_Mandant
FROM @xml.nodes(N'/fields/fields[@mandat="true"]') AS A(fld)
This will return only the first and the last <fields> node.
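If the XML sits in a table column instead of a variable, the same predicate can be combined with CROSS APPLY. This is only a sketch: the table variable @Docs and its columns DocId and Doc are hypothetical names.

```sql
-- @Docs is a hypothetical table holding one XML document per row
DECLARE @Docs TABLE (DocId INT, Doc XML);

INSERT INTO @Docs (DocId, Doc) VALUES
(1, N'<fields><fields name="a" active="1" mandat="true" /><fields name="b" active="1" /></fields>'),
(2, N'<fields><fields name="c" active="1" /></fields>');

-- CROSS APPLY emits one row per matching <fields> node;
-- documents without any mandat="true" node drop out of the result
SELECT d.DocId
      ,fld.value(N'@name',N'nvarchar(max)') AS Field_Name
FROM @Docs AS d
CROSS APPLY d.Doc.nodes(N'/fields/fields[@mandat="true"]') AS A(fld);
```

Switching to OUTER APPLY would keep documents without a match and return NULL for Field_Name instead.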

SQL Server xquery sum cast error when schema data type is string

Trying to run this in SQL Server 2014 in order to sum all Values in "UserData" xml:
IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE name = 'SC')
DROP XML SCHEMA COLLECTION SC
go
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
go
Declare @xml xml(SC)
set @xml= '<UserData>
<Item Key="CONVERTED_PAGES_1" Type="CONVERTED_PAGES">
<Value>2</Value>
</Item>
<Item Key="CONVERTED_PAGES_2" Type="CONVERTED_PAGES">
<Value>4</Value>
</Item>
</UserData>'
Select @xml.value('sum(/UserData/Item[@Type="CONVERTED_PAGES"]/Value)','int') as Sum
and getting the following error:
Msg 9308, Level 16, State 1, Line 16
XQuery [value()]: The argument of 'sum()' must be of a single numeric primitive type or 'http://www.w3.org/2004/07/xpath-datatypes#untypedAtomic'. Found argument of type 'xs:string *'.
I tried changing the select to the following:
Select @xml.value('sum(/UserData/Item[@Type="CONVERTED_PAGES"]/Value cast as xs:int?)','int') as Sum
But then I get this:
Msg 2365, Level 16, State 1, Line 16
XQuery [value()]: Cannot explicitly convert from 'xs:string *' to 'xs:int ?'
I am not able to change the xml schema in this case, but figured I could cast in order to perform this operation (since I know that in my case all of the Values will be int). Any suggestions would be appreciated!
The XQuery sum() aggregate requires numeric input, but Value is currently defined as a string in your XSD. To get this to work, you have three options:
Option 1:
You change the schema to force "value" to be an int. Instead of the first line below, use the second. (The difference is highlighted in between the two statements with "|||||||".)
Query 1:
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
|||||||
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:integer" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
Option 2:
If changing the XSD is not an option, you can also use the T-SQL SUM aggregate instead of the xquery one, like this:
Query 2:
IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE name = 'SC')
DROP XML SCHEMA COLLECTION SC
go
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
go
Declare @xml xml(SC)
set @xml= '<UserData>
<Item Key="CONVERTED_PAGES_1" Type="CONVERTED_PAGES">
<Value>2</Value>
</Item>
<Item Key="CONVERTED_PAGES_2" Type="CONVERTED_PAGES">
<Value>4</Value>
</Item>
</UserData>'
SELECT SUM(N.value('.','INT')) AS [Sum]
FROM @xml.nodes('/UserData/Item[@Type="CONVERTED_PAGES"]/Value') AS X(N);
Option 3:
As you noticed, SQL Server does not allow us to convert an XSD-typed value to another data type. To get around that, you could instruct SQL Server to forget about the schema:
Query 3:
IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE name = 'SC')
DROP XML SCHEMA COLLECTION SC;
GO
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>';
GO
DECLARE @xml XML(SC);
SET @xml= '<UserData>
<Item Key="CONVERTED_PAGES_1" Type="CONVERTED_PAGES">
<Value>2</Value>
</Item>
<Item Key="CONVERTED_PAGES_2" Type="CONVERTED_PAGES">
<Value>4</Value>
</Item>
</UserData>';
SELECT CAST(@xml AS XML).value('sum((/UserData/Item[@Type="CONVERTED_PAGES"]/Value ))','int') AS Sum;
Note: Without the schema, you still cannot cast (not sure why), but the sum now works without casting.
Update:
I did a little more digging. The original error message you got after attempting to cast is this one:
Msg 2365, Level 16, State 1, Line 16
XQuery [value()]: Cannot explicitly convert from 'xs:string *' to 'xs:int ?'
It tells us that you can't convert a sequence of strings into a single integer.
The * as well as the ? are Occurrence Indicators. So the error message reads: zero-to-many strings can't be converted to zero-to-one integer.
Your XQuery /UserData/Item[@Type="CONVERTED_PAGES"]/Value returns more than one value, and to sum them up we need to convert each one individually.
XQuery offers multiple ways to accomplish that, but not all of them work in SQL Server. The one that works uses a for-each construct:
SELECT @xml.value('sum(for $val in /UserData/Item[@Type="CONVERTED_PAGES"]/Value return $val cast as xs:int?)','INT') AS [Sum];
Thanks to @MikaelEriksson for helping me out with this.