How to prevent XML reformatting in SQL - sql

I updated an XML file to remove an unnecessary field using
deletexml(xmltype(xxx)).getClobVal()
but the XML comes back as one long string instead of a properly formatted XML file with indents and spaces. Any idea what I'm doing wrong here? Thanks

getClobVal and getStringVal have been deprecated since Oracle 11.2. Instead of these functions, use xmlserialize.
Example:
select xmlserialize(document xmltype('<a><b><c>xxx</c></b></a>') indent size=2) from dual;
You will end up with a CLOB containing the pretty-printed XML.

"properly formatted XML file with indents and spaces"
That might surprise you, but that is properly formatted (well-formed) XML. The XML standard says nothing about whitespace between structural elements, except that it is allowed. It's called "insignificant white-space" for a reason.
If you want to format your XML for human readability, you must do that yourself. But XML isn't for humans, it's for machines, so there is no reason to have your SQL do such formatting. If you want to inspect the XML as a human, use any tool you like that auto-formats XML for readability.
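As an illustration of the "use any tool" route, here is a minimal sketch in Python using the standard library's xml.dom.minidom. The helper name pretty_xml is made up for this example, and it assumes the XML has already been fetched from the database as a string.

```python
from xml.dom import minidom


def pretty_xml(xml_text: str, indent: str = "  ") -> str:
    """Re-serialize xml_text with one element per line (hypothetical helper)."""
    document = minidom.parseString(xml_text)
    pretty = document.toprettyxml(indent=indent)
    # toprettyxml may emit whitespace-only lines when the input was
    # already indented, so drop any line that is blank.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


compact = "<a><b><c>xxx</c></b></a>"
print(pretty_xml(compact))
```

The indentation is added only in the copy you look at; the stored XML stays as compact as the database returned it.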

I use this procedure to make a pretty XML:
PROCEDURE MakePrettyXml(xmlString IN OUT NOCOPY CLOB) IS
xmlDocFragment DBMS_XMLDOM.DOMDOCUMENTFRAGMENT;
xslProc DBMS_XSLPROCESSOR.PROCESSOR;
xsl DBMS_XSLPROCESSOR.STYLESHEET;
xmlStringOut CLOB;
BEGIN
DBMS_LOB.CREATETEMPORARY(xmlStringOut, TRUE);
xslProc := DBMS_XSLPROCESSOR.NEWPROCESSOR;
xsl := DBMS_XSLPROCESSOR.NEWSTYLESHEET(
'<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'||
'<xsl:output method="xml" indent="yes"/>'||
'<xsl:template match="@*|node( )">'||
'<xsl:copy>'||
'<xsl:apply-templates select="@*|node( )"/>'||
'</xsl:copy>'||
'</xsl:template>'||
'</xsl:stylesheet>', NULL);
xmlDocFragment := DBMS_XSLPROCESSOR.PROCESSXSL(p => xslProc, ss => xsl, cl => xmlString);
DBMS_XMLDOM.WRITETOCLOB(DBMS_XMLDOM.MAKENODE(xmlDocFragment), xmlStringOut);
DBMS_XSLPROCESSOR.FREESTYLESHEET(xsl);
DBMS_XSLPROCESSOR.FREEPROCESSOR(xslProc);
xmlString := xmlStringOut;
DBMS_LOB.FREETEMPORARY(xmlStringOut);
END MakePrettyXml;
Note that the output is a CLOB rather than an XMLTYPE, so you may need some additional conversions.

Related

Why does Replace '&amp;' with '&' not work for XML data?

I need to download an XML file whose data is retrieved from a stored procedure.
My problem is that if the data contains any '&' symbol, in the XML file it shows as
'&amp;'
I have used the REPLACE function in my procedure as shown below, but...
SELECT @V_NAME = REPLACE(@V_NAME, ' &amp; ', ' & ');
UPDATE #TMP_RS_XML
SET OBJECT_ID=@V_ID,
FNAME=@V_FILE,
DOCUMENT=(SELECT @V_NAME as 'Description',
...
Now, the output is:
&amp;
This is not the way this is supposed to work...
XML is not just some text with fancy extras; it follows very strict rules. Like any text-based container, it needs either magic words or special characters to tell the consumer what is content and what is markup.
The most important markup characters in XML are < and > - of course. If you want these characters to be part of your content, you'll have to replace them. That is done with XML entities.
Within the content, every XML entity starts with an ampersand (< is written as &lt;), so the ampersand is the third most important special character. If you want an ampersand within the content, you must use an entity (&amp;) as a code meaning "in this place we want an ampersand".
You must distinguish between the text you see, when you look at the XML and the actual content taken out of the XML.
Try this:
DECLARE @SomeStringWithSpecialCharacters NVARCHAR(200)=N'This & that -> let''s see, why how some foreign characters behave: அரிச். And what about a line break?' + CHAR(13) + CHAR(10) + 'Here is the second line. And an unprintable?' + CHAR(2);
--Here we use FOR XML, all the escaping is done implicitly
SELECT @SomeStringWithSpecialCharacters AS TestIt FOR XML PATH('test');
The result
<test>
<TestIt>This &amp; that -&gt; let's see, why how some foreign characters behave: அரிச். And what about a line break?&#x0D;
Here is the second line. And an unprintable?&#x02;</TestIt>
</test>
Now I take the XML as it came out of the first part and place it into an XML-typed variable.
Attention: I had to remove the &#x02; entity, check it out...
DECLARE @SomeXML XML=
N'<test>
<TestIt>This &amp; that -&gt; let''s see, why how some foreign characters behave: அரிச். And what about a line break?&#x0D;
Here is the second line. And an unprintable?</TestIt>
</test>';
--Now we do the magic using .value() against a native XML:
SELECT @SomeXML.value('(/test/TestIt/text())[1]','nvarchar(max)');
The result comes out with all entities resolved back to plain characters:
This & that -> let's see, why how some foreign characters behave: அரிச். And what about a line break?
Here is the second line. And an unprintable?
The general hint is: Never do the replacements yourself. Pushing content into the XML will need escaping and reading content out of XML will need the opposite. All this is done for you implicitly, when you use the proper tools.
'&' is a special character that is rendered from '&amp;'.
The best practice here would be to decode the XML; see the reference below:
https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.htmldecode?redirectedfrom=MSDN&view=netframework-4.8#overloads

Can SSMS indent xml when pasting into Editor?

Does SSMS - SQL Server 2014 have an option to automatically indent XML text?
I save XML text in a column (nvarchar(max)) to analyze the input of an application.
Usually the result of my queries are set to grid and I copy and paste the result into the query editor to read it.
This is what I get:
<?xml version="1.0"?><farm-confirm source="orders.company.com"><Detail><item_keyid>3207890</item_keyid><item_code>50002035</item_code></Detail></farm-confirm>
This is what I would like:
<?xml version="1.0"?>
<farm-confirm source="orders.company.com">
  <Detail>
    <item_keyid>3207890</item_keyid>
    <item_code>50002035</item_code>
  </Detail>
</farm-confirm>
Thanks
Given the XML is well formed, the easiest way is to do this:
DECLARE @xml XML=N'Put your XML here';
SELECT @xml;
(Output to Grid-View)
And now just click on the XML. The XML-Viewer will present it formatted and indented.
Or use one of the free online XML prettifiers.
Just google for "online pretty xml formatter".
Update
If you get the XML (which is - in your case - actually a string) from a query, you might just wrap the column with CAST(MyColumn AS XML). This will offer you the XML viewer immediately...

How to not escape special chars when updating XML in oracle SQL

I have a problem trying to update xmlType values in oracle.
I need to modify the xml looking similar to the following:
<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags> </c>
</a>
What I want to achieve is to modify <b/> without modifying <c/>
Unfortunately, the following updatexml call:
select
updatexml(XML_TO_MODIFY, '/a/b/text()', 'NewValue')
from dual;
returns this:
<a>
<b>NewValue</b>
<c>Here is some narrative containing weirdly escaped &lt;tags&gt;&lt;/tags&gt; </c>
</a>
as you can see, the '>' had been escaped.
Same happens for xmlQuery (the new non-deprecated version of updateXml):
select /*+ no_xml_query_rewrite */
xmlquery(
'copy $d := .
modify (
for $i in $d/a
return replace value of node $i/b with ''nana''
)
return $d'
passing t.xml_data
returning content
) as updated_doc
from (select xmlType('<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags> </c>
</a>') as xml_data from dual) t
;
Also when using xmlTransform I will get the same result.
I tried to use the
disable-output-escaping="yes"
But it did the opposite - it unescaped the < :
select XMLTransform(
xmlType('<a>
<b>Something to change here</b>
<c>Here is some narrative containing weirdly escaped &lt;tags>\&lt;/tags> </c>
</a>'),
XMLType(
'<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/a/b">
<b>
<xsl:value-of select="text()"/>
</b>
</xsl:template>
<xsl:template match="/a/c">
<c>
<xsl:value-of select="text()" disable-output-escaping="yes"/>
</c>
</xsl:template>
</xsl:stylesheet>'))
from dual;
returned:
<a>
<b>NewValue</b>
<c>Here is some narrative containing weirdly escaped <tags></tags> </c>
</a>
Any suggestions?
Two things you need to know:
I cannot modify the initial format - it comes to me this way and I need to preserve it.
The original message is so big that changing the message to a string and back (to use regexps as a workaround) will not do the trick.
The root of your issue seems to be that the original XML value for node c contains a literal > instead of &gt;, and is not inside a CDATA section (see also "What does <![CDATA[]]> in XML mean?"). A bare > is permitted in character data, but a serializer is free to write it back as &gt;, which is what the update does.
The string value of:
Here is some narrative containing weirdly escaped <tags>\</tags>
in XML format should really be
<c>Here is some narrative containing weirdly escaped &lt;tags&gt;\&lt;/tags&gt;</c>
OR
<c><![CDATA[Here is some narrative containing weirdly escaped <tags>\</tags>]]></c>
I would either request that the XML be corrected at the source, or implement some method to sanitize the inputs yourself, such as wrapping the <c> node values in <![CDATA[]]>. If you need to save the exact original value, and the messages are large, then the best I can think of is to store duplicate copies: the original value as a string, and the "sanitized" value as the XML data type.
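To make the equivalence concrete, here is a small sketch (Python's standard xml.etree.ElementTree, with the sample text shortened for illustration; this is not Oracle's implementation). It parses three spellings of the same node and shows that they carry the same string value, and that re-serializing the half-escaped spelling escapes the bare > as well:

```python
import xml.etree.ElementTree as ET

# Three spellings of the same <c> node (shortened sample text).
fully_escaped = r"<c>weirdly escaped &lt;tags&gt;\&lt;/tags&gt;</c>"
half_escaped = r"<c>weirdly escaped &lt;tags>\&lt;/tags></c>"
as_cdata = r"<c><![CDATA[weirdly escaped <tags>\</tags>]]></c>"

# All three parse to the same string value.
values = {ET.fromstring(doc).text for doc in (fully_escaped, half_escaped, as_cdata)}
print(values)

# Re-serializing the half-escaped form writes the bare ">" as an entity too.
reserialized = ET.tostring(ET.fromstring(half_escaped), encoding="unicode")
print(reserialized)
```

Once the document has been parsed, the original spelling is gone, so any tool that parses and re-serializes can only reproduce the value, not the exact bytes. That is why keeping the original as a string is the only way to preserve it exactly.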
In the end we managed to do this with the help of Java, by:
reading the XML as a CLOB
modifying it in Java
storing it back in the database using java.sql.Connection (for some reason, if we used JdbcTemplate, it complained about casting to Long, which was an indication that the string was over 4000 bytes (talking about clean errors, all hail Oracle), and using the CLOB type didn't really help. I guess that's a different story, though)
When storing the data, oracle does not perform any magic, only updates tend to modify escape characters.
Possibly not an answer for everyone, but a nice workaround if you stumble upon same problem as we did.

Trimming spaces out of a string based on a pattern in SQL Server

I have a varchar(max) field with XML data in it. I need to clean it by removing the spaces between the tags. For eg:
</tns:time_changed> <tns:changed_properties>
should be cleaned as
</tns:time_changed><tns:changed_properties>
I need to do this in a single query and I cannot use replace all white spaces as there are other relevant spaces in the content.
Try like this:
UPDATE YourTable -- placeholder table name
SET xmlColumnName = REPLACE ( xmlColumnName , '> <' , '><' );
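Note that the REPLACE above only matches exactly one space between the tags. If the text can be processed outside the database, a regular expression handles any run of whitespace. The following is a rough sketch in Python; the function name is invented for this example, and it assumes the field contains no CDATA sections or whitespace-only text nodes that must be kept.

```python
import re

# One or more whitespace characters sitting between ">" and "<".
_BETWEEN_TAGS = re.compile(r">\s+<")


def strip_inter_tag_whitespace(xml_text: str) -> str:
    """Remove whitespace between adjacent tags, leaving element text alone."""
    return _BETWEEN_TAGS.sub("><", xml_text)


sample = "<tns:a>keep these spaces</tns:a> \n  <tns:b>x</tns:b>"
print(strip_inter_tag_whitespace(sample))
```

Spaces inside "keep these spaces" survive because they are not directly between a > and a <.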
I converted the field to the XML type, and it automatically took care of the spaces.
I also replaced the
<?xml version="1.0"?>
declaration in the field with an empty string, which got rid of the error I was getting: "text/xmldecl not at the beginning of input".

escaping special characters in sql for generating xml file

I am using SQL server 2008 and I'm quite new to writing sql. My aim is to export data from a table into xml format to create a CAP xml file that can be used in our website. Currently, I'm just writing some select statements to retrieve data in the correct format. Here is the code:
select (SELECT TOP 5 [Master_Incident_Number] AS incidents
,[Jurisdiction] AS jurisdiction
,[Response_Date] AS Date
FROM [ESCAD_DW_System].[dbo].[CurrentIncidents_V] Incident
FOR XML PATH ('area'), type ) AS Alert for xml path (''),
ROOT ('?xml version = "1.0" encoding = "UTF-8"?')
However, I am getting 'invalid XML identifier' error for '?' symbol. Can anyone help?
You cannot use ROOT to add an XML declaration with an encoding. The only way to do that is to convert the XML output into varchar(max) and prepend your declaration. Keep in mind, though, that FOR XML output is UTF-16 by default, and therefore your UTF-8 encoding is probably unnecessary. Having said that, here is a simple example that uses a UDF to convert the XML into varchar(max):
create function gimmexml()
returns varchar(max)
as
begin
return (
select a='some', b='xml'
for xml path ('')
)
end
go
select '<?xml version="1.0" encoding="UTF-8"?>'+dbo.gimmexml();
Result:
<?xml version="1.0" encoding="UTF-8"?><a>some</a><b>xml</b>
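If the file is written by application code rather than by SQL Server, the declaration can be produced by the serializer, so that the declared encoding matches the bytes actually written. This is a minimal sketch with Python's standard xml.etree.ElementTree (the xml_declaration argument needs Python 3.8 or later); the element names mirror the question and the values are placeholders, not real data.

```python
import xml.etree.ElementTree as ET

# Build a tiny document shaped like the question's output (placeholder values).
root = ET.Element("Alert")
area = ET.SubElement(root, "area")
ET.SubElement(area, "incidents").text = "12345"
ET.SubElement(area, "jurisdiction").text = "North & East"

# Serialize to UTF-8 bytes and let the library prepend the declaration.
payload = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(payload.decode("utf-8"))
```

The special character in "North & East" is escaped by the serializer as well, so no manual escaping is needed before writing the file.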
Further reading:
http://www.devnewsgroups.net/group/microsoft.public.sqlserver.xml/topic60022.aspx
http://msdn.microsoft.com/en-us/library/ms345137%28v=sql.90%29.aspx