Does nodes() or openxml returns rows in same order as it finds in xml? - sql

I have an xml which i need to parse using openxml or nodes(). The xml contains few child tags that repeat with different values, as below.
<root>
<value>10</value>
<value>12</value>
<value>11</value>
<value>1</value>
<value>15</value>
<root>
For my code it is very important that i get all these rows returned in same order as in xml. I googled and gogled but nothing tells me if the #mp:id is always returned in same order as in xml. Or if nodes() return values in same order as it encounters them.
All I want to know if I can trust any of those two methods and be happy with proper order of rows.
P.S. excuse any errors or mistakes in above text, I dont enjoy typing codes in an android window either.

You can use row_number on the shredded XML like this.
declare #XML xml=
'<root>
<value>10</value>
<value>12</value>
<value>11</value>
<value>1</value>
<value>15</value>
</root>'
select value
from
(
select T.N.value('.', 'int') as value,
row_number() over(order by T.N) as rn
from #xml.nodes('/root/value') as T(N)
) as T
order by T.rn
Uniquely Identifying XML Nodes with DENSE_RANK
Update:
You can also use a numbers table like this;
declare #XML xml=
'<root>
<value>10</value>
<value>12</value>
<value>11</value>
<value>1</value>
<value>15</value>
</root>';
with N(Number) as
(
select Number
from master..spt_values
where type = 'P'
)
select #XML.value('(/root/value[sql:column("N.Number")])[1]', 'int')
from N
where N.Number between 1 and #XML.value('count(/root/value)', 'int')
order by N.Number

XPath allows you to select nodes explicitly by ordinal: '/root[1]/value[1]' is the first element, '/root[1]/value[2]' is the second etc. Also could use '(/root/value)[1]' and '(/root/value[2])'. This way you can select exactly the element you want, and selecting element 1 then element 2 then element 3 etc will give you controlled order. Slow, but controlled.
Updated P.S. Wouldn't this be nice to be true?
declare #x xml = '<root>
<value>10</value>
<value>12</value>
<value>11</value>
<value>1</value>
<value>15</value>
<root>';
select x.value(N'position()', N'int') as position,
x.value(N'.', 'int') as value
from #x.nodes(N'//root/value') t(x)
Unfortunately, is not...
Msg 2371, Level 16, State 1, Line 9
XQuery [value()]: 'position()' can only be used within a predicate or XPath selector
And the existence of this error makes me worry that order may be broken sometimes...

Related

SQL query to get most recent date from XML document

I am doing a SQL query against a column with an XML document located.
The XML document looks like the following.
<root>
<date>2016-10-12</date>
<date>2016-12-01</date>
<date>2016-11-13</date>
</root>
As you can see the dates are out of order.
I am looking for a SQL query that will get the most recent date from the XML document (in this case: 2016-12-01).
One way is to read all data and find the maximum externally (external ORDER BY with TOP 1, like in Prdp's answer, or MAX(), eventually with GROUP BY).
Another way is a FLWOR-XQuery:
DECLARE #xml XML=
'<root>
<date>2016-10-12</date>
<date>2016-12-01</date>
<date>2016-11-13</date>
</root>';
SELECT #xml.value('max(for $d in /root/date return xs:date($d))','date')
This means:
Take each value in /root/date, return it as date and find the highest!
Both approaches will need to read the whole list, but it should be a bit faster only to look for the maximum value, rather than return a full list and do some external sorting, picking again...
Try this
DECLARE #xml XML
SET #xml = '<root>
<date>2016-10-12</date>
<date>2016-12-01</date>
<date>2016-11-13</date>
</root>'
SELECT Top 1 x.col.value('.', 'date') AS dates
FROM #xml.nodes('/root/date') x(col)
ORDER BY dates DESC

Parsing Dynamic XML to SQL Server tables with Parent and child relation

I have a XML in Source Table. I need to parse this XML to 3 different tables which has Parent Child relationship. I can do this in C# but currently for this i need to implement it at SQL server side.
The sample xml looks like:
<ROWSET>
<ROW>
<HEADER_ID>5001507</HEADER_ID>
<ORDER_NUMBER>42678548</ORDER_NUMBER>
<CUST_PO_NUMBER>LSWQWE1</CUST_PO_NUMBER>
<CUSTOMER_NUMBER>38087</CUSTOMER_NUMBER>
<CUSTOMER_NAME>UNIVERSE SELLER</CUSTOMER_NAME>
<LINE>
<LINE_ROW>
<HEADER_ID>5001507</HEADER_ID>
<LINE_ID>12532839</LINE_ID>
<LINE_NUMBER>1</LINE_NUMBER>
<ITEM_NUMBER>STAGEPAS 600I-CA</ITEM_NUMBER>
<ORDER_QUANTITY>5</ORDER_QUANTITY>
</LINE_ROW>
<LINE_ROW>
<HEADER_ID>5001507</HEADER_ID>
<LINE_ID>12532901</LINE_ID>
<LINE_NUMBER>3</LINE_NUMBER>
<ITEM_NUMBER>CD-C600 RK</ITEM_NUMBER>
<ORDER_QUANTITY>6</ORDER_QUANTITY>
</LINE_ROW>
<LINE_ROW>
<HEADER_ID>5001507</HEADER_ID>
<LINE_ID>12532902</LINE_ID>
<LINE_NUMBER>4</LINE_NUMBER>
<ITEM_NUMBER>CD-S300 RK</ITEM_NUMBER>
<ORDER_QUANTITY>8</ORDER_QUANTITY>
</LINE_ROW>
</LINE>
<PRCADJ>
<PRCADJ_ROW>
<PRICE_ADJUSTMENT_ID>43095064</PRICE_ADJUSTMENT_ID>
<HEADER_ID>5001507</HEADER_ID>
<LINE_ID>12532839</LINE_ID>
<ADJUSTED_AMOUNT>-126</ADJUSTED_AMOUNT>
</PRCADJ_ROW>
<PRCADJ_ROW>
<PRICE_ADJUSTMENT_ID>43095068</PRICE_ADJUSTMENT_ID>
<HEADER_ID>5001507</HEADER_ID>
<LINE_ID>12532840</LINE_ID>
<ADJUSTED_AMOUNT>-96.6</ADJUSTED_AMOUNT>
</PRCADJ_ROW>
</PRCADJ>
</ROW>
</ROWSET>
The issue is the Parent can have multiple child and each child can multiple sub child. How can i write query to transfer this into Sql Server 2005
You need to use three CROSS APPLY operators to break up the "list of XML elements" into separate pseudo tables of XML rows, so you can access their properties - something like this:
SELECT
HeaderID = XCRow.value('(HEADER_ID)[1]', 'int'),
OrderNumber = XCRow.value('(ORDER_NUMBER)[1]', 'int'),
LineHeaderID = XCLine.value('(HEADER_ID)[1]', 'int'),
LineID = XCLine.value('(LINE_ID)[1]', 'int'),
LineNumber = XCLine.value('(LINE_NUMBER)[1]', 'int'),
PriceAdjustmentID = XCPrc.value('(PRICE_ADJUSTMENT_ID)[1]', 'int'),
AdjustedAmount = XCPrc.value('(ADJUSTED_AMOUNT)[1]', 'decimal(20,4)')
FROM
dbo.YourTableNameHere
CROSS APPLY
Data.nodes('/ROWSET/ROW') AS XTRow(XCRow)
CROSS APPLY
XCRow.nodes('LINE/LINE_ROW') AS XTLine(XCLine)
CROSS APPLY
XCRow.nodes('PRCADJ/PRCADJ_ROW') AS XTPrc(XCPrc)
With this, the first CROSS APPLY will handle all the elements that are found directly under <ROWSET> / <ROW> (the header information), the second one will enumerate all instances of <LINE> / <LINE_ROW> below that header element, and the third CROSS APPLY handles the <PRCADJ> / <PRCADJ_ROW> elements, also below the header.
You might need to tweak the outputs a bit - and I only picked two or three of the possible values - extend and adapt to your own needs! But this should show you the basic mechanism - the .nodes() method returns a "pseudo table" of XML fragments, one for each match of the XPath expression you define.
you can do some thing like this. using cross apply you will get node elements and then extract the value using value clause. you need to specify the column type i.e int or varchar etc.
The result can then be inserted using insert into select query.
insert into Table1 values ( header_id, order_number, cust_po_number)
select R.value('(HEADER_ID)[1]', 'int') As header_id,
R.value('(ORDER_NUMBER)[1]', 'int') as order_number,
R.value('(CUST_PO_NUMBER)[1]', 'varchar(256)') as cust_po_number
from table
cross apply XMLdata.nodes('/ROWSET/ROW') AS P(R)
insert into Table2 values ( header_id, line_id, line_number)
select R.value('(HEADER_ID)[1]', 'int') As header_id,
R.value('(LINE_ID)[1]', 'int') as line_id,
R.value('(LINE_NUMBER)[1]', 'int') as line_number
from table
cross apply XMLdata.nodes('/ROWSET/ROW/LINE/LINE_ROW') AS P(R)

Parsing nested XML into SQL table

What would be the right way to parse the following XML block into SQL Server table according to desired layout (below)? Is it possible to do it with a single SELECT statement, without UNION or a loop? Any takers? Thanks in advance.
Input XML:
<ObjectData>
<Parameter1>some value</Parameter1>
<Parameter2>other value</Parameter2>
<Dates>
<dateTime>2011-02-01T00:00:00</dateTime>
<dateTime>2011-03-01T00:00:00</dateTime>
<dateTime>2011-04-01T00:00:00</dateTime>
</Dates>
<Values>
<double>0.019974</double>
<double>0.005395</double>
<double>0.004854</double>
</Values>
<Description>
<string>this is row 1</string>
<string>this is row 2</string>
<string>this is row 3</string>
</Values>
</ObjectData>
Desired table output:
Parameter1 Parameter2 Dates Values Description
Some value Other value 2011-02-01 00:00:00.0 0.019974 this is row 1
Some value Other value 2011-03-01 00:00:00.0 0.005395 this is row 2
Some value Other value 2011-04-01 00:00:00.0 0.004854 this is row 3
I am after an SELECT SQL statement using OPENXML or xml.nodes() functionality. For example, the following SELECT statement results in production between Values and Dates (that is all permutations of Values and Dates), which is something I want to avoid.
SELECT
doc.col.value('Parameter1[1]', 'varchar(20)') Parameter1,
doc.col.value('Parameter2[1]', 'varchar(20)') Parameter2,
doc1.col.value('.', 'datetime') Dates ,
doc2.col.value('.', 'float') [Values]
FROM
#xml.nodes('/ObjectData') doc(col),
#xml.nodes('/ObjectData/Dates/dateTime') doc1(col),
#xml.nodes('/ObjectData/Values/double') doc2(col);
You can make use of a numbers table to pick the first, second, third etc row from the child elements. In this query I have limited the rows returned to the number if dates provided. If there are more values or descriptions than dates you have to modify the join to take that into account.
declare #XML xml = '
<ObjectData>
<Parameter1>some value</Parameter1>
<Parameter2>other value</Parameter2>
<Dates>
<dateTime>2011-02-01T00:00:00</dateTime>
<dateTime>2011-03-01T00:00:00</dateTime>
<dateTime>2011-04-01T00:00:00</dateTime>
</Dates>
<Values>
<double>0.019974</double>
<double>0.005395</double>
<double>0.004854</double>
</Values>
<Description>
<string>this is row 1</string>
<string>this is row 2</string>
<string>this is row 3</string>
</Description>
</ObjectData>'
;with Numbers as
(
select number
from master..spt_values
where type = 'P'
)
select T.N.value('Parameter1[1]', 'varchar(50)') as Parameter1,
T.N.value('Parameter2[1]', 'varchar(50)') as Parameter2,
T.N.value('(Dates/dateTime[position()=sql:column("N.Number")])[1]', 'datetime') as Dates,
T.N.value('(Values/double[position()=sql:column("N.Number")])[1]', 'float') as [Values],
T.N.value('(Description/string[position()=sql:column("N.Number")])[1]', 'varchar(max)') as [Description]
from #XML.nodes('/ObjectData') as T(N)
cross join Numbers as N
where N.number between 1 and (T.N.value('count(Dates/dateTime)', 'int'))
Use the OPENXML function. It is a rowset provider (it returns the set of rows parsed from the XML) and thus can be utilized in SELECT or INSERT like:
INSERT INTO table SELECT * FROM OPENXML(source, rowpattern, flags)
Please see the first example in the documentation link for clarity.
Typically, if you wanted to parse XML, you'd do it a programming language like Perl, Python, Java or C# that a) has an XML DOM, and b) can communicate with a relational database.
Here's a short article that shows you some of the basics of reading and writing XML in C# ... and even has an example of how to create an XML document from a SQL query (in one line!):
http://www.c-sharpcorner.com/uploadfile/mahesh/readwritexmltutmellli2111282005041517am/readwritexmltutmellli21.aspx

Safely retrieve values from element that can contain different values

We have some xml elements in a database that [for older data] can sometimes contain guids and sometimes contain integers.
is there a nice way of pulling out all the integrs only?
This will fail if the value element contains a guid!
select
ra.*,
t.c.value('.', 'int') as organisationId
from
Audit.EmployeeAudit ra
cross apply ra.EmployeeXml.nodes('//*:employee/*:property[*:name="ORG"]/*:value') t(c)
Sample Xml
<employee>
<property>
<name>ORG</name>
<value>39</value> <!-- Sometimes this will be a guid -->
<description>Leeds</description>
</property>
</employee>
You could add a predicate to only match entries of less than or equal to 10 characters.
;with EmployeeAudit as
(
SELECT CAST('<employee><property>
<name>ORG</name>
<value>39</value> <!-- Sometimes this will be a guid -->
<description>Leeds</description>
</property></employee>
' AS XML) AS EmployeeXml
UNION ALL
SELECT CAST('<employee><property>
<name>ORG</name>
<value>2FD29F11-59FC-47FD-BC30-DD330A53284E</value>
<description>Leeds</description>
</property></employee>
' AS XML)
)
select
ra.*,
t.c.value('.', 'int') as organisationId
from
EmployeeAudit ra
cross apply
ra.EmployeeXml.nodes
('//*:employee/*:property[*:name="ORG"]/*:value[string-length() <= 10]') t(c)
Or actually this might be a bit more robust
('//*:employee/*:property[*:name="ORG"]/*:value[ceiling(.) = .]') t(c)

How do I select a top-level attribute of an XML column in SQL Server?

I have an XML column in SQL Server that is the equivalent of:
<Test foo="bar">
<Otherstuff baz="belch" />
</Test>
I want to get the value of the foo attribute of Test (the root element) as a varchar. My goal would be something along the lines of:
SELECT CAST('<Test foo="bar"><Otherstuff baz="belch" /></Test>' AS xml).value('#foo', 'varchar(20)') AS Foo
But when I run the above query, I get the following error:
Msg 2390, Level 16, State 1, Line 1
XQuery [value()]: Top-level attribute
nodes are not supported
John Saunders has it almost right :-)
declare #Data XML
set #Data = '<Test foo="bar"><Otherstuff baz="belch" /></Test>'
select #Data.value('(/Test/#foo)[1]','varchar(20)') as Foo
This works for me (SQL Server 2005 and 2008)
Marc
If you dont know the root element:
select #Data.value('(/*/#foo)[1]','varchar(20)') as Foo
Why does .value('#foo', 'varchar(20)') generate the error “Top-level attribute nodes are not supported”?
When you query the xml data type, the context is the document node, which is an implicit node that contains the root element(s) of your XML document. The document node has no name and no attributes.
How can I get the value of an attribute on the root element?
In your XQuery expression, include the path to the first root element:
DECLARE #Data xml = '<Customer ID="123"><Order ID="ABC" /></Customer>'
SELECT #Data.value('Customer[1]/#ID', 'varchar(20)')
-- Result: 123
If you don’t know (or don’t want to specify) the name of the root element, then just use * to match any element:
SELECT #Data.value('*[1]/#ID', 'varchar(20)')
-- Result: 123
Because the query context is the document node, you don’t need to prefix the XQuery expression with a forward slash (as the other answers unnecessarily do).
Why do I have to include [1]?
The XQuery expression you pass to value() must be guaranteed to return a singleton. The expression Customer/#ID doesn’t satisfy this requirement because it matches both ID="123" and ID="456" in the following example:
DECLARE #Data xml = '<Customer ID="123" /><Customer ID="456" />'
Remember that the xml data type represents an XML document fragment, not an XML document, so it can contain multiple root elements.
What’s the difference between Customer[1]/#ID and (Customer/#ID)[1]?
The expression Customer[1]/#ID retrieves the ID attribute of the first <Customer> element.
The expression (Customer/#ID)[1] retrieves the ID attribute of all <Customer> elements, and from this list of attributes, picks the first.
The following example demonstrates the difference:
DECLARE #Data xml = '<Customer /><Customer ID="123" /><Customer ID="456" />'
SELECT #Data.value('Customer[1]/#ID', 'varchar(20)')
-- Result: NULL (because the first Customer element doesn't have an ID attribute)
SELECT #Data.value('(Customer/#ID)[1]', 'varchar(20)')
-- Result: 123