Parsing SQL with bad xml namespace - sql

Hi, I have the following SQL that tries to parse XML and extract the "OrderNumber". The problem is that this XML (which I have no control over) has a weird XML namespace. I changed it to abc.com just for this example, but it's something else. When that namespace is present, the T-SQL returns NULL in the result. If I remove the namespace manually, or with a search and replace in T-SQL, it works just fine. I could just do the search and replace, but that solution bothers me. Does anyone know a better way around this? And maybe an explanation of why it doesn't like namespaces? I would really appreciate some advice. Thanks!
Declare @Transmission xml
set @Transmission = '<Transmission>
<Requests>
<SubmitOrdersRequest>
<Orders>
<Order xmlns="http://www.abc.com">
<OrderNumber>123</OrderNumber>
</Order>
</Orders>
</SubmitOrdersRequest>
</Requests>
</Transmission>'
select @Transmission.value('(Transmission/Requests/SubmitOrdersRequest/Orders/Order/OrderNumber/text())[1]', 'varchar(100)')

Child nodes inherit the default namespace of their parent unless they declare a namespace themselves. You have to define namespaces using WITH XMLNAMESPACES and properly qualify node names with them.
Declare @Transmission xml
set @Transmission = '<Transmission>
<Requests>
<SubmitOrdersRequest>
<Orders>
<Order xmlns="http://www.abc.com">
<OrderNumber>123</OrderNumber>
</Order>
</Orders>
</SubmitOrdersRequest>
</Requests>
</Transmission>';
with xmlnamespaces('http://www.abc.com' as ns1)
select @Transmission.value('(Transmission/Requests/SubmitOrdersRequest/Orders/ns1:Order/ns1:OrderNumber/text())[1]', 'varchar(100)')
Note: The reason for namespaces is that names are contextual things. Order can mean in your case a purchase but in another context it could mean display rack order. The namespace gives the name more uniqueness.
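The same behaviour can be reproduced outside SQL Server. The following is a minimal sketch using Python's standard xml.etree.ElementTree (not T-SQL; the prefix name ns1 is an arbitrary choice, as in the query above). It shows that an unqualified path finds nothing, while a namespace-qualified path finds the order number:

```python
import xml.etree.ElementTree as ET

doc = """<Transmission>
  <Requests>
    <SubmitOrdersRequest>
      <Orders>
        <Order xmlns="http://www.abc.com">
          <OrderNumber>123</OrderNumber>
        </Order>
      </Orders>
    </SubmitOrdersRequest>
  </Requests>
</Transmission>"""

root = ET.fromstring(doc)  # root is the <Transmission> element

# Unqualified names only match elements in no namespace, so this finds nothing.
unqualified = root.find("Requests/SubmitOrdersRequest/Orders/Order/OrderNumber")

# Bind a prefix (the name "ns1" is arbitrary) to the namespace URI and qualify
# every element that lives in that namespace, including the inherited child.
ns = {"ns1": "http://www.abc.com"}
qualified = root.find(
    "Requests/SubmitOrdersRequest/Orders/ns1:Order/ns1:OrderNumber", ns
)

print(unqualified)
print(qualified.text)
```

Note that OrderNumber needs the prefix too, even though it carries no xmlns attribute of its own, because it inherits the default namespace from Order.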

Related

SQL get value from XML in tag, by tag value

I have the following XML:
<Main>
<ResultOutput>
<Name>TEST1</Name>
<Value>D028</Value>
</ResultOutput>
<ResultOutput>
<Name>TEST2</Name>
<Value>Accept</Value>
</ResultOutput>
<ResultOutput>
<Name>TEST3</Name>
<Value />
</ResultOutput>
</Main>
What I want is to get the value of the <Value> tag in SQL.
Basically, I want to get the <Value> where <Name> has the value TEST1, as an example.
This is what I have at the moment, but it depends on the position of the XML tag:
XMLResponse.value('(Main/ResultOutput/Value)[5]', 'nvarchar(max)')
The best way to do this is not to put extra where .value clauses, but to do it directly in XQuery.
Use [nodename] to filter by a child node; you can even nest such predicates. text() gets you the inner text of the node:
XMLResponse.value('(/Main/ResultOutput[Name[text()="TEST1"]]/Value/text())[1]', 'nvarchar(max)')
Below is an example using the sample XML in your question. You'll need to extend this to add namespace declarations and the proper xpath expressions that may be present in your actual XML as your query attempt suggests.
SELECT ResultOutput.value('Value[1]', 'nvarchar(100)')
FROM @xml.nodes('Main/ResultOutput') AS Main(ResultOutput)
WHERE ResultOutput.value('Name[1]', 'nvarchar(100)') = N'TEST1';
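For comparison, here is a small sketch of the same "filter by a sibling's value" predicate in Python's standard xml.etree.ElementTree, which supports a limited XPath subset. The helper name value_for is made up for this illustration, and the name is interpolated into the path without escaping, so it is only suitable for trusted input:

```python
import xml.etree.ElementTree as ET

doc = """<Main>
  <ResultOutput><Name>TEST1</Name><Value>D028</Value></ResultOutput>
  <ResultOutput><Name>TEST2</Name><Value>Accept</Value></ResultOutput>
  <ResultOutput><Name>TEST3</Name><Value /></ResultOutput>
</Main>"""

root = ET.fromstring(doc)


def value_for(name):
    """Return the <Value> text of the <ResultOutput> whose <Name> equals name.

    Hypothetical helper for illustration; returns None when there is no match
    or when the matching <Value> element is empty.
    """
    # [Name='...'] keeps only the ResultOutput elements with a matching child,
    # so the lookup no longer depends on element position.
    node = root.find(f"ResultOutput[Name='{name}']/Value")
    return None if node is None else node.text


print(value_for("TEST1"))
print(value_for("TEST2"))
```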

SQL SERVER xml with CDATA

I have a table in my database with a column containing xml. The column type is nvarchar(max). The xml is formed in this way
<root>
<child>....</child>
.
.
<special>
<event><![CDATA[text->text]]></event>
<event><![CDATA[text->text]]></event>
...
</special>
</root>
I have not created the db, I cannot change the way information is stored in it but I can retrieve it with a select. For the extraction I use
select cast(replace(xml,'utf-8','utf-16')as xml)
from table
It works well except for CDATA, whose content in the query output is: text-&gt;text
Is there a way to retrieve also the CDATA tags?
Well, this is, as far as I know, not possible by normal means...
The CDATA section has one sole reason: include invalid characters within XML for lazy people...
CDATA is not seen as needed at all and therefore is not really supported by normal XML methods. Or in other words: It is supported in the way, that the content is properly escaped. There is no difference between correctly escaped content and not-escaped content within CDATA actually! (Okay, there are some minor differences like including ]]> within a CDATA-section and some more tiny specialties...)
The big question is: Why?
What are you trying to do with this afterwards?
Try this. The included text is returned as is:
DECLARE @xml XML =
'<root>
<special>
<event><![CDATA[text->text]]></event>
<event><![CDATA[text->text]]></event>
</special>
</root>'
SELECT t.c.query('text()')
FROM @xml.nodes('/root/special/event') t(c);
So, please explain in more detail: what do you really want?
If you really need nothing more than the wrapping CDATA, you might use this:
SELECT '<![CDATA[' + t.c.value('.','varchar(max)') + ']]>'
FROM @xml.nodes('/root/special/event') t(c);
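The claim that CDATA and properly escaped content are equivalent once parsed is easy to check with any XML parser. Below is a minimal sketch in Python's standard library (not T-SQL). The helper wrap_cdata is a made-up name; it mirrors the string concatenation above and additionally handles the ]]> special case mentioned earlier by splitting it across two CDATA sections:

```python
import xml.etree.ElementTree as ET

with_cdata = ET.fromstring("<event><![CDATA[text->text]]></event>")
escaped = ET.fromstring("<event>text-&gt;text</event>")

# After parsing, both forms carry exactly the same character data; the parser
# keeps no record of whether a CDATA section was used.
same_text = with_cdata.text == escaped.text


def wrap_cdata(text):
    """Wrap text in a CDATA section (hypothetical helper for illustration).

    A literal ']]>' would end the section early, so it is split across two
    adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


print(same_text)
print(wrap_cdata(with_cdata.text))
```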
Update: Same with outdated FROM OPENXML
I just tried how the outdated approach with FROM OPENXML handles this and found that there is absolutely no indication in the result set that the given text was originally within a CDATA section. The "Some value here" is returned in exactly the same way as the text within CDATA:
DECLARE @doc XML =
'<root>
<child>Some value here </child>
<special>
<event><![CDATA[text->text]]></event>
<event><![CDATA[text->text]]></event>
</special>
</root>';
DECLARE @hnd INT;
EXEC sp_xml_preparedocument @hnd OUTPUT, @doc;
SELECT * FROM OPENXML (@hnd, '/root',0);
EXEC sp_xml_removedocument @hnd;
This is how to include CDATA on child nodes in XML using pure SQL, but it's not ideal.
SELECT 1 AS tag,
null AS parent,
'10001' AS 'Customer!1!Customer_ID!Element',
'AirBallon Captain' AS 'Customer!1!Title!cdata',
'Customer!1' = (
SELECT
2 AS tag,
NULL AS parent,
'Wrapped in cdata, using explicit' AS 'Location!2!Title!cdata'
FOR XML EXPLICIT)
FOR XML EXPLICIT, ROOT('Customers')
CDATA is included, but the child element is encoded using &gt; instead of >.
Which is so weird from a sensible point of view. I'm sure there are technical explanations, but they are stupid, because there is no difference in the FOR XML specification.
You could include the TYPE option on the inner child node, and then lose the CDATA too.
BUT WHY OH WHY?!?!?!?! would you (Microsoft) remove cdata, when I just added it?
<Customers>
<Customer>
<Customer_ID>10001</Customer_ID>
<Title><![CDATA[AirBallon Captain]]></Title>
<Location>
<Title><![CDATA[wrapped in cdata, using explicit]]></Title>
</Location>
</Customer>
</Customers>

SQL update Statement converting to XQuery

As the title states, I am rather stuck converting an UPDATE SQL query to XQuery. My example for the update is shown below.
UPDATE Products
SET [List Price] = 19
WHERE [ID]= 1;
Where needed, just assume a variable. I get how XQuery works and the pathing; just use random examples if that's the only way. I just can't find a good example that explains how to update in the same way.
<Products>
<Supplier_x0020_IDs>
<Value>4</Value>
</Supplier_x0020_IDs>
<ID>1</ID>
<Product_x0020_Code>NWTB-1</Product_x0020_Code>
<Product_x0020_Name>Northwind Traders Chai</Product_x0020_Name>
<Standard_x0020_Cost>13.5</Standard_x0020_Cost>
<List_x0020_Price>18</List_x0020_Price>
<Reorder_x0020_Level>10</Reorder_x0020_Level>
<Target_x0020_Level>40</Target_x0020_Level>
<Quantity_x0020_Per_x0020_Unit>10 boxes x 20 bags</Quantity_x0020_Per_x0020_Unit>
<Discontinued>0</Discontinued>
<Minimum_x0020_Reorder_x0020_Quantity>10</Minimum_x0020_Reorder_x0020_Quantity>
<Category>Beverages</Category>
</Products>
This is what the XML looks like. It's stupidly messy, which is why I avoided posting it, but now I'm stuck again.
If you want to do update in XQuery, then you need to use XQuery Update and it really depends whether your implementation supports this or not.
As you have not told us what your XML looks like, I will assume something like:
<Products>
<Product>
<ID>1</ID>
<ListPrice>20</ListPrice>
</Product>
<Product>
<ID>2</ID>
<ListPrice>7</ListPrice>
</Product>
</Products>
If so you can use the following XQuery Update expression:
replace value of node /Products/Product[ID eq "1"]/ListPrice with "19"
Note that I have assumed that your XQuery processor is schema-unaware and does not know the types of your nodes, so I have assumed strings throughout. I have also removed the space from "List Price", as XML element names cannot contain spaces.
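If the processor at hand does not support XQuery Update, the same "select by ID, then replace the value" step can be done with an ordinary XML API. The following is a rough sketch in Python's standard xml.etree.ElementTree, using the assumed Products document from this answer rather than the asker's real XML:

```python
import xml.etree.ElementTree as ET

doc = """<Products>
  <Product><ID>1</ID><ListPrice>20</ListPrice></Product>
  <Product><ID>2</ID><ListPrice>7</ListPrice></Product>
</Products>"""

root = ET.fromstring(doc)

# Rough equivalent of: UPDATE Products SET [List Price] = 19 WHERE [ID] = 1
# The predicate [ID='1'] plays the role of the WHERE clause; values are
# compared as strings, matching the schema-unaware assumption above.
for price in root.findall("Product[ID='1']/ListPrice"):
    price.text = "19"

# Collect ID -> ListPrice to confirm that only product 1 changed.
prices = {p.findtext("ID"): p.findtext("ListPrice") for p in root.findall("Product")}
print(prices)
```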

SQL and escaped XML data

I have a table with a mix of escaped and non-escaped XML. Of course, the data I need is escaped. For example, I have:
<Root>
<InternalData>
<Node>
&lt;ArrayOfComment&gt;
&lt;Comment&gt
&lt;SequenceNo&gt;1&lt;/SequenceNo&gt;
&lt;IsDeleted&gt;false&lt;/IsDeleted&gt;
&lt;TakenByCode&gt;397&lt;/TakenByCode&gt;
&lt;/Comment&gt
&lt;/ArrayOfComment&gt;
</Node>
</InternalData>
</Root>
As you can see, the data in the Node tag is all escaped. I can use a query to obtain the Node data, but how can I convert it to XML in SQL so that it can be parsed and broken up? I'm pretty new to using XML in SQL, and I can't seem to find any examples of this.
Thanks
You have not given enough information about your end goal, but this will get you very close. FYI - You had two missing ; both after comment&gt
declare @xml xml
set @xml = '
<Root>
<InternalData>
<Node>
&lt;ArrayOfComment&gt;
&lt;Comment&gt;
&lt;SequenceNo&gt;1&lt;/SequenceNo&gt;
&lt;IsDeleted&gt;false&lt;/IsDeleted&gt;
&lt;TakenByCode&gt;397&lt;/TakenByCode&gt;
&lt;/Comment&gt;
&lt;/ArrayOfComment&gt;
</Node>
</InternalData>
</Root>
'
select convert(xml, n.c.value('.', 'varchar(max)'))
from @xml.nodes('Root/InternalData/Node/text()') n(c)
Output
<ArrayOfComment>
<Comment>
<SequenceNo>1</SequenceNo>
<IsDeleted>false</IsDeleted>
<TakenByCode>397</TakenByCode>
</Comment>
</ArrayOfComment>
The result is an XML column that you can put into a variable or cross-apply into directly to get data from the XML fragment.
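The technique here is a two-pass parse: read the Node content as plain text (which un-escapes it), then parse that text as XML in its own right. A minimal sketch of the same idea in Python's standard library (not T-SQL), using the sample document with the two missing semicolons restored:

```python
import xml.etree.ElementTree as ET

doc = """<Root>
  <InternalData>
    <Node>
      &lt;ArrayOfComment&gt;
        &lt;Comment&gt;
          &lt;SequenceNo&gt;1&lt;/SequenceNo&gt;
          &lt;IsDeleted&gt;false&lt;/IsDeleted&gt;
          &lt;TakenByCode&gt;397&lt;/TakenByCode&gt;
        &lt;/Comment&gt;
      &lt;/ArrayOfComment&gt;
    </Node>
  </InternalData>
</Root>"""

# First pass: the outer parser turns &lt; and &gt; back into < and >, so the
# Node's text is now a string of ordinary markup.
outer = ET.fromstring(doc)
inner_text = outer.findtext("InternalData/Node")

# Second pass: parse that string as its own XML document.
inner = ET.fromstring(inner_text.strip())
taken_by = inner.findtext("Comment/TakenByCode")

print(inner.tag)
print(taken_by)
```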
Your best bet might be to look into a HTML Decoding UDF. I did a quick search and found this one:
http://www.andreabertolotto.net/Articles/HTMLDecodeUDF.aspx
You may want to modify it so it only decodes &gt; and &lt;. The one above seems to go above and beyond your needs.
UPDATE
@Cyberkiwi's solution seems to be a bit cleaner. I will leave this up in case the version of SQL Server you are running doesn't support his solution.

Import Xml nodes as Xml column with SSIS

I'm trying to use the XML Source to shred an XML source file; however, I do not want the entire document shredded into tables. Rather, I want to import the XML nodes into rows of XML.
A simplified example would be to import the document below into a table called "people" with a column called "person" of type "xml". Looking at the XML Source, it seems that it is suited to shredding the source XML into multiple records, which is not quite what I'm looking for.
Any suggestions?
<people>
<person>
<name>
<first>Fred</first>
<last>Flintstone</last>
</name>
<address>
<line1>123 Bedrock Way</line1>
<city>Drumheller</city>
</address>
</person>
<person>
<!-- more of the same -->
</person>
</people>
I didn't think that SSIS 2005 supported the XML datatype at all. I suppose it "supports" it as DT_NTEXT.
In any case, you can't use the XML Source for this purpose. You would have to write your own. That's not actually as hard as it sounds. Base it on the examples in Books Online. The processing would consist of moving to the first child node, then calling XmlReader.ReadSubtree to return a new XmlReader over just the next <person/> element. Then use your favorite XML API to read the entire <person/>, convert the resulting XML to a string, and pass it along down the pipeline. Repeat for all <person/> nodes.
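The processing loop described above (one <person/> at a time, serialized back to a string, one string per row) can be sketched in a few lines. This is a Python illustration of the approach only, not an SSIS script component and not the XmlReader API; the function name person_rows and the second person's data are made up for the example:

```python
import xml.etree.ElementTree as ET

doc = """<people>
  <person>
    <name><first>Fred</first><last>Flintstone</last></name>
    <address><line1>123 Bedrock Way</line1><city>Drumheller</city></address>
  </person>
  <person>
    <name><first>Barney</first><last>Rubble</last></name>
  </person>
</people>"""


def person_rows(xml_text):
    """Yield each <person> element as its own XML string, one per row.

    Hypothetical helper: in a real pipeline each string would be passed
    downstream and stored in the xml-typed "person" column.
    """
    root = ET.fromstring(xml_text)
    for person in root.findall("person"):
        person.tail = None  # drop the whitespace that followed the element
        yield ET.tostring(person, encoding="unicode")


rows = list(person_rows(doc))
print(len(rows))
```

For very large files, a streaming parser would be preferable to loading the whole document, which is the reason the answer suggests reading one subtree at a time.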
Could you perhaps change your xml output so that the content of person is seen as a string? Use escape chars for the <>.
You could use a script task to parse it as well, I'd imagine.