XML parsing error for ampersand in CDATA section - cdata

A CDATA section is supposed to treat the characters '&' and '<' as normal text, but my XML fails to parse because of an '&' inside a CDATA section.
The parser reports a SAXParser exception with lex code 3: invalid character ' ' found.
Can you please suggest how to correct this?

<![CDATA[xxxxxx > 0]]>
That is the CDATA form I use in my project; I think it will be useful for you.
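To see how a conforming parser treats the two cases, here is a minimal Python sketch using the standard library's ElementTree. The two documents are made up for illustration and are not taken from the question. If an '&' that really sits between `<![CDATA[` and `]]>` still fails, it may be worth checking whether the delimiters reach the parser intact.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents, invented for illustration.
with_cdata = "<cond><![CDATA[a & b < c]]></cond>"
bare_ampersand = "<cond>a & b</cond>"

# Inside a CDATA section the parser takes '&' and '<' literally.
cdata_text = ET.fromstring(with_cdata).text

# Outside CDATA, a bare '&' makes the document not well-formed.
try:
    ET.fromstring(bare_ampersand)
    bare_ok = True
except ET.ParseError:
    bare_ok = False

print(cdata_text)
print(bare_ok)
```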

Related

WIX How to include the equal signs and ampersand in the string table to avoid LGHT0104 error

I have a launch condition error string in String_en-US.wxl:
<WixLocalization Culture="en-us" Codepage="1252" xmlns="http://schemas.microsoft.com/wix/2006/localization">
<String Id="ERR_REQUIRED_APP_ABSENT">This product requires XXX to be on the system. Please download it from "https://knowledge.xxx.com/knowledge/llisapi.dll?func=ll&objId=59284919&objAction=browse&sort=name&viewType=1", install it and try again.</String>
</WixLocalization>
It seems having the ampersand signs (&) and the equal signs (=) cause the light error:
Strings_en-US.wxl(0,0): error LGHT0104: Not a valid localization file; detail: '=' is an unexpected token. The expected token is ';'. Line 36, position 172.
I even tried to escape the equal signs with a character reference equivalent to '=', but then it complained about the ampersand. How can I avoid the error?
CDATA: A CDATA section is "...a section of element content that is marked for the parser to interpret as only character data, not markup."
In this case, something like this:
<String Id="TEST1"><![CDATA[https://www.hi.com/one&two&three&v=1]]></String>
XML Escape Characters: XML escape characters are normally used for encoding special characters in XML documents. The escape for & is &amp; (more). CDATA is an alternative approach.
Links:
What characters do I need to escape in XML documents?
https://en.wikipedia.org/wiki/CDATA
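As a rough illustration of the escaping route, the Python sketch below escapes a URL before placing it in a String element and then parses it back. The host name and parameter values are placeholders, and the script only demonstrates the XML rule; it is not part of the WiX toolchain.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# URL shaped like the one in the question; host and values are placeholders.
url = "https://example.com/llisapi.dll?func=ll&objId=1&objAction=browse"

# escape() turns '&', '<' and '>' into entity references; '=' needs no escaping.
escaped = escape(url)
wxl_line = f'<String Id="ERR_REQUIRED_APP_ABSENT">{escaped}</String>'

# Parsing the element gives back the original URL.
round_trip = ET.fromstring(wxl_line).text

print(wxl_line)
```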

Add some rule for XML parsing

I added a rule to my XML parser grammar, but the project no longer compiles. I get the following errors:
Error 2 unknown attribute reference 'closeTag' in '$closeTag.text' D:\DevExpress\ControlEvaluation\RichEditControl\WindowsFormsRichEdit\WindowsFormsRichEdit\XMLParser.g4 40 29 WindowsFormsRichEdit
Error 1 unknown attribute reference 'openTag' in '$openTag.text' D:\DevExpress\ControlEvaluation\RichEditControl\WindowsFormsRichEdit\WindowsFormsRichEdit\XMLParser.g4 40 8 WindowsFormsRichEdit
element : '<' openTag=Name attribute* '>' content '<' '/' closeTag=Name '>'
| {$openTag.text.equals($closeTag.text)}?
| '<' Name attribute* '/>'
;
The openTag and closeTag labels are defined in your first alternative, but you refer to them in the second alternative, where they don't exist.
Don't do semantic checks in the parser. The equality of open and close tag names is a semantic constraint. Parse the input without such constraints and run a semantic phase once you have the parse tree. This also lets you print much better error messages (e.g. "Open and close tag must be the same" instead of "No viable alt").
For this semantic check, use the generated parse tree listener (or rather your own class derived from it).
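The idea of a separate semantic phase can be sketched independently of ANTLR. In the Python sketch below, the Element class is a hypothetical stand-in for a parse tree node (it is not a generated ANTLR class), and check_tags plays the role a listener would play when walking the real tree.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    # Hypothetical stand-in for a parse tree node; not an ANTLR class.
    open_tag: str
    close_tag: str
    children: list = field(default_factory=list)

def check_tags(node, errors=None):
    """Collect one message per element whose open and close names differ."""
    if errors is None:
        errors = []
    if node.open_tag != node.close_tag:
        errors.append(
            "Open and close tag must be the same: "
            f"<{node.open_tag}> closed by </{node.close_tag}>"
        )
    for child in node.children:
        check_tags(child, errors)
    return errors

# The parser accepted <a><b></c></a>; the semantic phase reports the mismatch.
tree = Element("a", "a", [Element("b", "c")])
errors = check_tags(tree)
print(errors)
```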

Why does Replace '&' with '&' not work for XML data?

I need to download an XML file whose data is retrieved from a stored procedure.
My problem is that if the data contains any '&' symbol, it shows up in the XML file as
'&amp;'
I have used REPLACE function in my Procedure as shown below but...
SELECT @V_NAME = REPLACE(@V_NAME, ' & ', ' & ');
UPDATE #TMP_RS_XML
SET OBJECT_ID=@V_ID,
FNAME=@V_FILE,
DOCUMENT=(SELECT @V_NAME as 'Description',
...
Now, the output is:
&amp;
This is not the way this is supposed to work...
XML is not just some text with fancy extras but with very strict rules. As any text-based container you will need either magic words or special characters to tell the consumer what is the content and what is the markup.
The most important markup characters in XML are < and > - of course. If you want these characters to be part of your content, you'll have to replace them. That is done with xml entities.
Within the content, any XML entity will start with an ampersand (< comes out as &lt;), therefore the ampersand is the third most important special character. If you want an ampersand within the content, you must use an entity (&amp;) as a code for "in this place we want an ampersand".
You must distinguish between the text you see, when you look at the XML and the actual content taken out of the XML.
Try this:
DECLARE @SomeStringWithSpecialCharacters NVARCHAR(200)=N'This & that -> let''s see, why how some foreign characters behave: அரிச். And what about a line break?' + CHAR(13) + CHAR(10) + 'Here is the second line. And an unprintable?' + CHAR(2);
--Here we use FOR XML, all the escaping is done implicitly
SELECT @SomeStringWithSpecialCharacters AS TestIt FOR XML PATH('test');
The result
<test>
<TestIt>This &amp; that -&gt; let's see, why how some foreign characters behave: அரிச். And what about a line break?&#x0D;
Here is the second line. And an unprintable?&#x02;</TestIt>
</test>
Now I take the XML as it came out of the first part and place it into a XML-typed variable.
Attention: I had to remove the &#x02; entity, check it out...
DECLARE @SomeXML XML=
N'<test>
<TestIt>This &amp; that -&gt; let''s see, why how some foreign characters behave: அரிச். And what about a line break?&#x0D;
Here is the second line. And an unprintable?</TestIt>
</test>';
--Now we do the magic using .value() against a native XML:
SELECT @SomeXML.value('(/test/TestIt/text())[1]','nvarchar(max)');
The result comes out with all entities resolved again:
This & that -> let's see, why how some foreign characters behave: அரிச். And what about a line break?
Here is the second line. And an unprintable?
The general hint is: Never do the replacements yourself. Pushing content into the XML will need escaping and reading content out of XML will need the opposite. All this is done for you implicitly, when you use the proper tools.
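The same principle holds outside SQL Server. As a minimal sketch, assuming Python's standard ElementTree as the "proper tool", the serializer escapes on the way in and the parser resolves the entities on the way out, with no manual REPLACE in between. The element name TestIt is borrowed from the example above.

```python
import xml.etree.ElementTree as ET

content = "This & that -> let's see"

# Writing: the serializer escapes markup characters for us.
elem = ET.Element("TestIt")
elem.text = content
serialized = ET.tostring(elem, encoding="unicode")

# Reading: the parser turns the entities back into plain characters.
read_back = ET.fromstring(serialized).text

print(serialized)
print(read_back)
```

The serialized form is what you see when you look at the XML; read_back is the actual content taken out of it.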
'&' is a special character; what you see is '&amp;' being rendered as '&'.
The best practice here would be to decode the XML. A reference is added below:
https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.htmldecode?redirectedfrom=MSDN&view=netframework-4.8#overloads

Insert special character XML to SQL

I'm trying to update a column of type XML.
Text to be inserted in the XML fields: "& Decision ↨‼ Agreement"
Text converted to XML: <?xml version="1.0" encoding="utf-16"?><Informations xmlns="http://monschema"><Text lGic="fdf475bc-9fed-4f61-b321-f81949cb51ca" id="71e231e6-ecbd-4848-ba6f-004bdddefb79">&amp; Décision &#x12;&#x13; Accord</Text></Informations>
Error: Msg 9420, Level 16, State 1, Line 7
XML parsing: line 1, character 263, illegal xml character
I do not understand why the character with ascii code "&#x12" has a problem.
If I replace &#x12 by &#x20, it works !
Can you help me?
Thank you in advance
The character references &#x12; and &#x13; denote control characters that are disallowed in XML 1.0. The real problem here is that they do not denote the characters you have in the text. The characters “↨‼” are U+21A8 UP DOWN ARROW WITH BASE and U+203C DOUBLE EXCLAMATION MARK, so they should be written as &#x21A8;&#x203C;.
The reason why you get the odd character references is probably that in the CP437 encoding, “↨‼” are placed in code positions 12 and 13 (hex.). So this is an encoding confusion: some step has applied a wrong conversion. In XML, the numbers in character references always mean Unicode code points.
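A small Python sketch of both points, assuming the standard ElementTree parser as a stand-in for any XML 1.0 parser: character references built from the Unicode code points parse fine, while a reference to a control character is rejected. The helper name char_refs is made up for this example.

```python
import xml.etree.ElementTree as ET

def char_refs(text):
    """Write each character as a hexadecimal XML character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)

# References derived from the Unicode code points of the intended characters.
refs = char_refs("↨‼")
good = ET.fromstring(f"<Text>{refs}</Text>").text

# A reference to a control character is rejected by an XML 1.0 parser.
try:
    ET.fromstring("<Text>&#x12;</Text>")
    control_ok = True
except ET.ParseError:
    control_ok = False

print(refs)
print(control_ok)
```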
These control characters are not supported in XML version 1.0 documents.
You should be able to change your version to 1.1 in the version attribute of the document, in which case the document should validate.
I solved my problem.
The character comes from a SQL view on an Oracle database.
The character -> in Oracle is interpreted as ↨ by SQL Server.
I'll do a replace in my view.

XML parse for special characters containing in Element

I want to read the below XML using XMLREADER.
<?xml version="1.0" encoding="ISO-8859-1" ?>
<Inforamtion>
<Name;Property>Name contain
</Name;Property>
<123>89</123>
<question?>
</question?>
</Inforamtion>
But it throws an error because the element names contain special characters, and because the first character of an element name can't be a digit.
I may receive such XML in bulk and need to process and correct it.
Please guide me on how to read or correct such XML.
Thank you
Your XML is not valid.
This document:
What are the rules for a valid XML element name? will help you correct this XML.
A summary:
A Name is a token beginning with a letter or one of a few punctuation
characters ( [_] and [:]) , and continuing with letters, digits, hyphens, underscores,
colons, or full stops, together known as name characters.
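The summary above can be approximated with a regular expression. The Python sketch below is ASCII-only, so it is stricter than the real rule (which also admits many non-ASCII letters), and the repair policy in sanitize_name (replace bad characters with an underscore, prefix an underscore if the first character is not allowed) is just one hypothetical choice. Since an XML parser refuses the malformed file outright, a repair like this would have to run over the raw text before parsing.

```python
import re

# ASCII-only approximation of the Name rule quoted above.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")

def is_valid_name(name):
    """True if the whole string looks like a legal (ASCII) XML name."""
    return NAME_RE.fullmatch(name) is not None

def sanitize_name(name):
    """Hypothetical repair policy: replace bad characters with '_'."""
    cleaned = re.sub(r"[^A-Za-z0-9_:.\-]", "_", name)
    if not re.match(r"[A-Za-z_:]", cleaned):
        cleaned = "_" + cleaned
    return cleaned

# Element names taken from the question.
for name in ["Name;Property", "123", "question?"]:
    print(name, is_valid_name(name), sanitize_name(name))
```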