Convert an XML UTF-8 encoded string to XML datatype in SQL Server - sql

Converting an XML string using CAST( AS XML) works as expected in many scenarios, but fail with an error "illegal xml character" if the string contains accented chars.
This example fails with error "XML parsing: line 2, character 8, illegal xml character":
declare #Text VARCHAR(max) =
'<?xml version="1.0" encoding="UTF-8"?>
<ROOT>níveis porém alocação</ROOT>'
select CAST(#Text AS XML)
According to XML Specification all of them are legal XML chars, but replacing accented chars with an 'X' char will result in a sucessfull CAST:
declare #MessageText VARCHAR(max) =
'<?xml version="1.0" encoding="UTF-8"?>
<ROOT>nXveis porXm alocaXXo</ROOT>'
select CAST(#MessageText AS XML)
Result: <ROOT>nXveis porXm alocaXXo</ROOT>
Moreover, the same XML but UTF-16 encoded, inexplicably works:
declare #MessageText NVARCHAR(max) =
'<?xml version="1.0" encoding="UTF-16"?>
<ROOT>níveis porém alocação</ROOT>'
select CAST(#MessageText AS XML)
Result: <ROOT>níveis porém alocação</ROOT>
Are those chars illegal in UTF-8? Or there is a better way to convert into an XML datatype?

SQL Server strips any XML Declaration prolog internally for XML data type and uses UTF-16 encoding. Here is how to handle correctly your use case.
SQL
-- Method #1
DECLARE #Text NVARCHAR(MAX) = N'<ROOT>níveis porém alocação</ROOT>';
SELECT CAST(#Text AS XML);
-- Method #2
DECLARE #MessageText NVARCHAR(MAX) =
'<?xml version="1.0" encoding="UTF-16"?>
<ROOT>níveis porém alocação</ROOT>';
SELECT CAST(#MessageText AS XML);

Related

Error "The name "soap" does not denote a namespace" if i try to parse XML in T-sql

When i try to parse XML in T-sql I get error "The name "soap" does not denote a namespace"
This is my XML and query. Answer, please, how can I parse such XML. Thanks.
DECLARE #xml xml
SET #xml =
'<soap:Envelope xmlns:soap=http://schemas.xmlsoap.org/soap/envelope/>
<soap:Body>
<GetWebLinkResponse xmlns=http://test.org/ xmlns:ns2=http://schemas.datacontract.org/2004/07/.Configuration xmlns:ns3=http://schemas.datacontract.org/2004/07/.Configuration.Results xmlns:ns4=http://schemas.microsoft.com/2003/10/Serialization/>
<GetWebLinkResult>
<ns3:Error>
<ns2:Id>0</ns2:Id>
<ns2:Text>Success.</ns2:Text>
</ns3:Error>
<ns3:Url>
https://example.com&content=true
</ns3:Url>
</GetWebLinkResult>
</GetWebLinkResponse>
</soap:Body>
</soap:Envelope>'
SELECT
-- n.value('(./Code/text())[1]','int') as CODE
--,
n.value('ns3:Url','varchar(1000)')as NAME
FROM #xml.nodes('/soap:Envelope/soap:Body/GetWebLinkResponse/GetWebLinkResult') as a(n)
First of all - your XML is invalid - your namespace definitions must be in double quotes:
DECLARE #xml xml
SET #xml =
'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetWebLinkResponse xmlns="http://test.org/" xmlns:ns2="http://schemas.datacontract.org/2004/07/.Configuration" xmlns:ns3="http://schemas.datacontract.org/2004/07/.Configuration.Results" xmlns:ns4="http://schemas.microsoft.com/2003/10/Serialization/">
<GetWebLinkResult>
<ns3:Error>
<ns2:Id>0</ns2:Id>
<ns2:Text>Success.</ns2:Text>
</ns3:Error>
<ns3:Url>
https://example.com&content=true
</ns3:Url>
</GetWebLinkResult>
</GetWebLinkResponse>
</soap:Body>
</soap:Envelope>';
Once you've corrected that, you'll need to define all relevant XML namespaces for your T-SQL query. Relevant XML namespaces are all explicitly used namespaces (like soap: and ns3:), as well as default namespaces like the xmlns="http://test.org/" on your <GetWebLinkResponse> node.
When you declare them all - the T-SQL query should work:
WITH XMLNAMESPACES('http://schemas.xmlsoap.org/soap/envelope/' AS soap,
'http://schemas.datacontract.org/2004/07/.Configuration.Results' as ns3,
DEFAULT 'http://test.org/')
SELECT
n.value('(ns3:Url)[1]', 'varchar(1000)') AS NAME
FROM
#xml.nodes('/soap:Envelope/soap:Body/GetWebLinkResponse/GetWebLinkResult') as a(n)

How to add additional XML node on top of my SQL generated XML

I have generated XML from a SQL Server FOR XML PATH statement as shown here:
USE MySQLDB
SELECT *
FROM BillTable
FOR XML PATH ('BillAdd'), ROOT ('BillAddRq')
And this is the result:
<BillAddRq>
<BillAdd>
<TxnID>2432-1071510295</TxnID>
<TimeCreated>2003-12-16T01:44:55</TimeCreated>
<TimeModified>2015-12-15T22:38:33</TimeModified>
<EditSequence>1450190313</EditSequence>
<TxnNumber>413</TxnNumber>
<VendorRef_ListID>E0000-933272656</VendorRef_ListID>
<VendorRef_FullName>Timberloft Lumber</VendorRef_FullName>
<APAccountRef_ListID>C0000-933270541</APAccountRef_ListID>
<APAccountRef_FullName>Accounts Payable</APAccountRef_FullName>
<TxnDate>2016-12-01T00:00:00</TxnDate>
<DueDate>2017-12-31T00:00:00</DueDate>
<AmountDue>80.50000</AmountDue>
<TermsRef_ListID>50000-933272659</TermsRef_ListID>
<TermsRef_FullName>1% 10 Net 30</TermsRef_FullName>
<IsPaid>0</IsPaid>
</BillAdd>
<BillAdd>
<TxnID>243A-1071510389</TxnID>
<TimeCreated>2003-12-16T01:46:29</TimeCreated>
<TimeModified>2015-12-15T22:38:33</TimeModified>
<EditSequence>1450190313</EditSequence>
<TxnNumber>414</TxnNumber>
<VendorRef_ListID>C0000-933272656</VendorRef_ListID>
<VendorRef_FullName>Perry Windows & Doors</VendorRef_FullName>
<APAccountRef_ListID>C0000-933270541</APAccountRef_ListID>
<APAccountRef_FullName>Accounts Payable</APAccountRef_FullName>
<TxnDate>2016-12-02T00:00:00</TxnDate>
<DueDate>2018-01-01T00:00:00</DueDate>
<AmountDue>50.00000</AmountDue>
<TermsRef_ListID>10000-933272658</TermsRef_ListID>
<TermsRef_FullName>Net 30</TermsRef_FullName>
<IsPaid>0</IsPaid>
</BillAdd>
</BillAddRq>
Now, I'd like to encapsulate the above with these nodes:
<?xml version="1.0" encoding="utf-8"?>
<?qbxml version="15.0"?>
<QBXML>
<QBXMLMsgsRq onError="stopOnError">
//above generated xml//
</QBXMLMsgsRq>
</QBXML>
How will I achieve this in a SQL Query I created above?
I am new to SQL Server and XML. I am trying to generate this XML directly from my database and vice versa to make it more efficient and faster — let my SQL directly communicate with XML.
ATTEMPT 1:
USE MySQLDB;
GO
DECLARE #myDoc XML;
SET #myDoc = '<QBXML>
<QBXMLMsgsRq onError="stopOnError">
</QBXMLMsgsRq>
</QBXML>';
SET #myDoc.modify('
insert
-- instead of inserting string here.. I would like to insert here the query I made above
into (/QBXML/QBXMLMsgsRq)[1]');
SELECT #myDoc;
ATTEMPT 2:
USE MySQLDB;
GO
DECLARE #myDoc XML;
SET #myDoc = '<QBXML>
<QBXMLMsgsRq onError="stopOnError">
</QBXMLMsgsRq>
</QBXML>';
DECLARE #qry XML;
SET #qry = (SELECT * FROM BillTable FOR XML PATH ('BillAdd'), ROOT ('BillAddRq'));
-- SELECT #qry;
SET #myDoc.modify('insert #qry
into (/QBXML/QBXMLMsgsRq)[1]');
SELECT #myDoc;
There are many ways to construct your XML result, consider the following three alternatives...
Use XML.modify() to insert the BillTable XML into an XML scalar variable (which includes the ?qbxml XML processing instruction):
declare #BillTableXml xml = (
select *
from BillTable
for xml path('BillAdd'), root('BillAddRq')
);
declare #myDoc xml = '<?xml version="1.0" encoding="utf-8"?>
<?qbxml version="15.0"?>
<QBXML>
<QBXMLMsgsRq onError="stopOnError">
</QBXMLMsgsRq>
</QBXML>';
set #myDoc.modify('
insert sql:variable("#BillTableXml")
into (/QBXML/QBXMLMsgsRq)[1]
');
select #myDoc as Result;
Use a nested query to construct the entire XML result (which does not, however, include the ?qbxml XML processing instruction):
select
'stopOnError' as [QBXML/QBXMLMsgsRq/#onError],
(
select *
from BillTable
for xml path('BillAdd'), root('BillAddRq'), type
) as [QBXML/QBXMLMsgsRq]
for xml path('');
Or use an XQuery to construct the entire XML result (which also includes the ?qbxml XML processing instruction):
select BillTableXml.query('
<?qbxml version="15.0"?>,
<QBXML>
<QBXMLMsgsRq onError="stopOnError">
{ /BillAddRq }
</QBXMLMsgsRq>
</QBXML>
') as Result
from (
select *
from BillTable
for xml path('BillAdd'), root('BillAddRq'), type
) Data (BillTableXml);

SSMS Obfuscation of string

I need to create a script to obfuscate some data.
The string is quite long and I need to obfuscate only some parts of it.
In the table the records are similar to this:
<?xml version="1.0" encoding="UTF-8"?><CONTRACT><IBC IBC_REF="f45f1231234ae5ac2easdasdfde5dfd" IBC_TYPE="I" TELEPHONE_1="1111111" TELEPHONE_2="11111111" MOBILE_PHONE="11111111" E_MAIL="asdasdasd#hotmail.com" SOLICITATION_MAIL="0" ARREARS_MAIL="1" MAIL_REDIRECTED="0" TITLE="Mrs" SURNAME_REGISTERED_NAME="Assadasd"
And it needs to become like this:
<?xml version="1.0" encoding="UTF-8"?><CONTRACT><IBC IBC_REF="f45f1231234ae5ac2easdasdfde5dfd" IBC_TYPE="I" TELEPHONE_1="Telephone-1" TELEPHONE_2="Telephone-2" MOBILE_PHONE="MobilePhone" E_MAIL="email-1" SOLICITATION_MAIL="0" ARREARS_MAIL="1" MAIL_REDIRECTED="0" TITLE="Mrs" SURNAME_REGISTERED_NAME="Surname"
How can I update all the rows of the table and change only some of the strings by saving the other words?
Seemingly you are working with XML strings, so I would suggest to use the SQL Server XML functionalities. Following a short example:
DECLARE #input NVARCHAR(4000) = '<?xml version="1.0" encoding="UTF-8"?><CONTRACT><IBC IBC_REF="f45f1231234ae5ac2easdasdfde5dfd" IBC_TYPE="I" TELEPHONE_1="1111111" TELEPHONE_2="11111111" MOBILE_PHONE="11111111" E_MAIL="asdasdasd#hotmail.com" SOLICITATION_MAIL="0" ARREARS_MAIL="1" MAIL_REDIRECTED="0" TITLE="Mrs" SURNAME_REGISTERED_NAME="Assadasd" /></CONTRACT>';
DECLARE #x xml = CONVERT(xml, REPLACE(#input,'encoding="UTF-8"','encoding="UTF-16"'));
SELECT #x.value('(/CONTRACT/IBC/#TELEPHONE_1)[1]', 'nvarchar(100)');
DECLARE #y xml = #x
SET #y.modify('replace value of (/CONTRACT/IBC/#TELEPHONE_1)[1] with "TELEPHONE_1"');
SELECT #y.value('(/CONTRACT/IBC/#TELEPHONE_1)[1]', 'nvarchar(100)');
SELECT #y
DECLARE #output NVARCHAR(4000) = '<?xml version="1.0" encoding="UTF-8"?>' + CONVERT(NVARCHAR(4000), #y)
SELECT #output

T-SQL Code to edit XML

In the XML text below, I want to write an update script to modify the date part 2013-12-30T04:30:00.000+00:00 with dateadd(minute, 2, getdate())
How do I get the format as represented in the XML text.
What would be the best way to concatenate the date?
<ScheduleDefinition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<StartDateTime xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">2013-12-30T04:30:00.000+00:00</StartDateTime>
<WeeklyRecurrence xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">
<WeeksInterval>1</WeeksInterval>
<DaysOfWeek>
<Sunday>true</Sunday>
<Monday>true</Monday>
<Tuesday>true</Tuesday>
<Wednesday>true</Wednesday>
<Thursday>true</Thursday>
<Friday>true</Friday>
<Saturday>true</Saturday>
</DaysOfWeek>
</WeeklyRecurrence>
</ScheduleDefinition>
This:-
declare #xml varchar(max)
set #xml='<ScheduleDefinition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<StartDateTime xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">2013-12-30T04:30:00.000+00:00</StartDateTime>
<WeeklyRecurrence xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">
<WeeksInterval>1</WeeksInterval>
<DaysOfWeek>
<Sunday>true</Sunday>
<Monday>true</Monday>
<Tuesday>true</Tuesday>
<Wednesday>true</Wednesday>
<Thursday>true</Thursday>
<Friday>true</Friday>
<Saturday>true</Saturday>
</DaysOfWeek>
</WeeklyRecurrence>
</ScheduleDefinition>'
declare #pos int
set #pos=charindex('</StartDateTime>',#xml)
select left(#xml,#pos-30)+
convert(varchar(23),dateadd(minute,2,sysutcdatetime()),126)+'+00:00'+
substring(#xml,#pos,datalength(#xml))
Returns:-
<ScheduleDefinition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<StartDateTime xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">2014-02-19T21:27:36.348+00:00</StartDateTime>
<WeeklyRecurrence xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">
<WeeksInterval>1</WeeksInterval>
<DaysOfWeek>
<Sunday>true</Sunday>
<Monday>true</Monday>
<Tuesday>true</Tuesday>
<Wednesday>true</Wednesday>
<Thursday>true</Thursday>
<Friday>true</Friday>
<Saturday>true</Saturday>
</DaysOfWeek>
</WeeklyRecurrence>
</ScheduleDefinition>
It uses charindex() to find a consistent piece of text (</StartDateTime>) in your xml. It then uses left() to cut off the start of the xml (truncating off the current date). It then uses sysutcdatetime() to get the current time of your server expressed in UTC time (so that later a consistent timezone offset of +00:00 can be applied). It then uses convert() with a style of 126 to convert the time into the format required for your xml. It then uses substring() and datalength() to add the end of your xml (the length does not need to be exact).
Hopefully, this will give you some ideas about how to go about cutting up your xml to substitute the date that you want.
DATETIMEOFFSET datatype and XML.modify(). I would not recommend treating XML as text if there is a way to do it using XML. Your namespaces make it a little tricky. This works on SQL Server 2008 R2.
DECLARE #XML XML
DECLARE #NEWVALUE DATETIMEOFFSET
set #XML='<ScheduleDefinition xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<StartDateTime xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">2013-12-30T04:30:00.000+00:00</StartDateTime>
<WeeklyRecurrence xmlns="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices">
<WeeksInterval>1</WeeksInterval>
<DaysOfWeek>
<Sunday>true</Sunday>
<Monday>true</Monday>
<Tuesday>true</Tuesday>
<Wednesday>true</Wednesday>
<Thursday>true</Thursday>
<Friday>true</Friday>
<Saturday>true</Saturday>
</DaysOfWeek>
</WeeklyRecurrence>
</ScheduleDefinition>'
;WITH XMLNAMESPACES('http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices' AS X)
SELECT #NEWVALUE = TableAlias.FieldAlias.value('(X:StartDateTime/text())[1]', 'DATETIMEOFFSET') FROM #XML.nodes('//ScheduleDefinition') AS TableAlias(FieldAlias)
SET #NEWVALUE = DATEADD(MINUTE,2,#NEWVALUE)
DECLARE #FORMAT VARCHAR(max) = CONVERT(VARCHAR(MAX),#NEWVALUE,126)
SET #XML.modify('
declare namespace x="http://schemas.microsoft.com/sqlserver/2006/03/15/reporting/reportingservices";
replace value of (/ScheduleDefinition/x:StartDateTime[1]/text())[1]
with sql:variable("#FORMAT")
')
SELECT #XML

adding encoding information to the result of FOR XML [duplicate]

This question already has answers here:
SQL Server FOR XML Enclosing Element?
(2 answers)
Closed 7 years ago.
I have a Script, which returns a XML using FOR XML in SQL 2008. Is there any way to add the version and encoding information in the beginning of the output. Eventually, i am planning to save the output in a file.
For example, right now my output looks like this
<Agents>
<Agent id="1">
<Name>Mike</Name>
<Location>Sanfrancisco</Location>
</Agent>
<Agent id="2">
<Name>John</Name>
<Location>NY</Location>
</Agent>
</Agents>
I would like to append the line <?xml version="1.0" encoding="UTF-8"?> in the beginning of the Xml output
So i want the output something like
<?xml version="1.0" encoding="UTF-8"?>
<Agents>
<Agent id="1">
<Name>Mike</Name>
<Location>Sanfrancisco</Location>
</Agent>
<Agent id="2">
<Name>John</Name>
<Location>NY</Location>
</Agent>
As #gbn points out in another answer and on another question, "the XML data is stored internally as ucs-2", and SQL Server doesn't include it when producing the data. However, you can convert the XML to a string and append the XML declaration at the beginning manually. However, simply using UTF-8 in the declaration would be inaccurate. The Unicode string which SQL produces is in UCS-2. For example, this will fail:
SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UTF-8"?>' + CONVERT(NVARCHAR(MAX),CONVERT(XML,N'<x>' + NCHAR(10176) + N'</x>')));
with error:
Msg 9402, Level 16, State 1, Line 1 XML parsing: line 1, character 38,
unable to switch the encoding
This, on the other hand, will work as expected:
SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UCS-2"?>' + CONVERT(NVARCHAR(MAX),CONVERT(XML,N'<x>' + NCHAR(10176) + N'</x>')));
Here is code which will produce the full, declaration-laden XML string you seek for your example data:
DECLARE #Agents TABLE
(
AgentID int,
AgentName nvarchar(50),
AgentLocation nvarchar(100)
);
INSERT INTO #Agents (AgentID, AgentName, AgentLocation) VALUES (1, N'Mike', N'Sanfrancisco');
INSERT INTO #Agents (AgentID, AgentName, AgentLocation) VALUES (2, N'John', N'NY');
WITH BaseData AS
(
SELECT
(
SELECT
AgentID AS '#id',
AgentName AS 'Name',
AgentLocation AS 'Location'
FROM #Agents
FOR XML PATH('Agent'), ROOT('Agents'), TYPE
) AS AgentXML
), FullStringTable AS
(
SELECT
*,
'<?xml version="1.0" encoding="UCS-2"?>' +
CONVERT(nvarchar(max),AgentXML) AS FullString
FROM BaseData
)
SELECT
AgentXML AS OriginalXML,
FullString,
CONVERT(xml,FullString) AS FullStringConvertedToXML
FROM FullStringTable;
SQL Server internally always uses utf-16 ucs-2 so you could just append it like we did. That is, SQL Server would never generate anything with "utf-8".
Edit: after some digging:
http://www.devnewsgroups.net/group/microsoft.public.sqlserver.xml/topic60022.aspx
http://forums.asp.net/t/1455808.aspx
If you will not be attempting to manipulate the results as TSQL XML then the simplest thing to do will be create a varchar with the string you wish to append then add the XML to it using a CAST statemnt to convert the XML to varchar.
declare #testXML as XML
declare #testPrefix as varchar(255)
set #testPrefix = '<?xml version="1.0" encoding="UTF-8"?>'
set #testXML = '<Agents> <Agent id="1"> <Name>Mike</Name> <Location>Sanfrancisco</Location> </Agent> <Agent id="2"> <Name>John</Name> <Location>NY</Location> </Agent></Agents>'
select #testPrefix
Select #testXML
select #testPrefix + CAST(#testXML as varchar(max))