XSLT 1 - How can I replace '&' with '&amp;' in XML file? - xslt-1.0

I need to find a solution using XSLT 1. Most of the XML files sent to me are well formed, but some senders break them by adding unescaped characters (&, <, >, ...). Is there any way to replace these on my side? I tried XSLT 2, but the replace function does not work because I use the XSLT processor from Microsoft:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
exclude-result-prefixes="xs saxon"
version="2.0">
<xsl:param name="path" select="'file:///E:/foo.xml'"></xsl:param>
<xsl:template match="/">
<xsl:copy-of select="unparsed-text($path)"></xsl:copy-of>
<xsl:copy-of select="saxon:parse(replace(unparsed-text($path), '&amp;', '&amp;amp;'))"/>
</xsl:template>
</xsl:stylesheet>
Are there any other suggestions for solving this issue? For example, I have an input XML file like:
<?xml version="1.0" encoding="UTF-8"?>
<name>Stack & Exchange</name>
It fails on the '&' character.
Please advise!
Thank you

Conclusion
Two observations
XSLT requires at least a well-formed XML document as input, so I can't use it to correct malformed XML
(it is an XML transformation language)
To read the input as text and replace or escape the invalid characters (unparsed-text() and replace()), I need an XSLT 2.0 processor
(I use the Microsoft XSLT 1.0 processor)
I see two options
If I receive an error on input, investigate and validate manually and send back the error message. THIS IS WHAT I TRIED TO AVOID! (It means using text tools like Notepad++ or Excel to find the issue.)
Write a correcting pre-processor in a .NET language that fixes the file before it is loaded as XML.

Related

XSLT: variables and "empty" labels

I have an XML data file containing, among other things, a string of arbitrarily many comma-separated values. I want those values to be displayed in a web browser as a list with one value per line. So I wrote an XSLT template that takes this string, displays the first value followed by a line-break tag (<br/>), properly namespaced, and recurses with the remainder of the string. In effect, the commas are replaced by HTML <br/> tags.
Now, when I store the result of calling that template in a xsl:variable, and display that through xsl:value-of, then the HTML tags disappear: what is shown is the string minus the commas.
When I display the result directly by having the xsl:call-template in place of the xsl:value-of, all is fine, and the values appear in a list.
So, what's going on?
Is this behavior an implementation artifact, or is it standard XSLT?
Use xsl:copy-of instead of xsl:value-of if you want to output nodes (like your br elements); xsl:value-of creates a single text node from the string value of whatever is selected.
Here is an example that shows the difference between xsl:value-of and xsl:copy-of, you will note that it is not the use of the variable with newly created br elements that makes the difference, it is simply the use of xsl:value-of that creates a text() node with the string conversion of the selection:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="html" indent="yes" version="5" doctype-system="about:legacy-doctype"/>
<xsl:variable name="var">Phrase 1.<br/>Phrase 2.<br/>Phrase 3.</xsl:variable>
<xsl:template match="/">
<html>
<head>
<title>.NET XSLT Fiddle Example</title>
</head>
<body>
<section>
<h1>Example 1: value-of</h1>
<xsl:value-of select="$var"/>
</section>
<section>
<h1>Example 2: copy-of</h1>
<xsl:copy-of select="$var"/>
</section>
<xsl:apply-templates select="//p"/>
<xsl:apply-templates select="//p" mode="copy-of"/>
</body>
</html>
</xsl:template>
<xsl:template match="p">
<section>
<h1>Example 1: value-of</h1>
<xsl:value-of select="."/>
</section>
</xsl:template>
<xsl:template match="p" mode="copy-of">
<section>
<h1>Example 1: copy-of</h1>
<xsl:copy-of select="."/>
</section>
</xsl:template>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/gWmuiJy/1
Output is
Example 1: value-of
Phrase 1.Phrase 2.Phrase 3.
Example 2: copy-of
Phrase 1.
Phrase 2.
Phrase 3.
Example 1: value-of
Line 1.Line 2.Line 3.
Example 1: copy-of
Line 1.
Line 2.
Line 3.
It seems that you hit the boundaries of the RTF ("Result tree fragment"):
When you use an XML fragment to initialize a variable or a parameter, then the variable or parameter is of the
"result tree fragment" datatype. This is an XSLT 1.0 specific datatype [just like node-set, but slightly different].
A result tree fragment is equivalent to a node-set that contains just the root node.
You cannot apply operators like "/", "//" or predicates to a result tree fragment. They are only applicable to node-set datatypes.
[...]
a) In XSLT 1.0
The resolution of this is to convert the result tree fragment into a node-set. I am not aware of any oracle specific xpath extension functions that can do this trick for you.
You could use EXSLT to achieve this.
b) Use XSLT 2.0
You can code your transformations in XSLT 2.0. XSLT 2.0 deprecates ResultTreeFragments i.e. if you are modeling an XSLT 2.0 transformation, and you create a variable or a parameter that holds a tree fragment, it is implicitly a node sequence.
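So in plain XSLT 1.0 you cannot navigate into such a variable without an extension function like exsl:node-set(), although xsl:copy-of still outputs its nodes as shown above. If you can, use XSLT 2.0 or 3.0 to avoid the limitation altogether.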
Is this behavior an implementation artifact, or is it standard XSLT?
It is standard for XSLT-1.0, but not for XSLT-2.0+.
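Applied to the comma-separated case from the question, the advice above could be sketched as follows in XSLT 1.0. This is only an illustration: the input element name values and the template name split are assumptions, since the original stylesheet was not posted.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical recursive template: writes each comma-separated value,
       with a br element between consecutive values. -->
  <xsl:template name="split">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, ',')">
        <xsl:value-of select="substring-before($text, ',')"/>
        <br/>
        <xsl:call-template name="split">
          <xsl:with-param name="text" select="substring-after($text, ',')"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Assumed input shape: <values>a,b,c</values> -->
  <xsl:template match="/">
    <!-- The variable holds a result tree fragment with text and br nodes. -->
    <xsl:variable name="lines">
      <xsl:call-template name="split">
        <xsl:with-param name="text" select="string(/values)"/>
      </xsl:call-template>
    </xsl:variable>
    <div>
      <!-- copy-of keeps the br elements; value-of would flatten them to text. -->
      <xsl:copy-of select="$lines"/>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

Swapping the xsl:copy-of for xsl:value-of in this sketch reproduces the behavior described in the question: the values run together with neither commas nor line breaks.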

Consolidate document() dependencies into single XSLT

I need to consolidate external files into a single XSLT 1.0 file as a transformation is going to be performed in memory without access to the original file structure.
I cannot figure out how to merge in the files which are read into variables.
The original variable declaration looks like this. (the parameter pLang is needed in the select statement I have trouble with):
<xsl:param name="pLang" select="'no'"/>
<xsl:variable name="moduleDoc" select="document('Headlines.xml')"/>
I have filled the moduleDoc variable with the contents of the Headlines file like this:
<xsl:param name="pLang" select="'no'"/>
<xsl:variable name="moduleDoc">
<module xmlns:mmx="http://funx" xmlns:svg="http://www.w3.org/2000/svg" xmlns:fnc="http://funx/fnc" xmlns:att="http://funx/att" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<document-merge>
<g-funcs>
<g name="IssueDate">
<g-lang xml:lang="no">Fakturadato</g-lang>
<g-lang xml:lang="nn">Fakturadato</g-lang>
<g-lang xml:lang="en">Issue date</g-lang>
</g>
<!--snip -->
</g-funcs>
</document-merge>
</module>
</xsl:variable>
The select statement looks like this:
<xsl:value-of select="$moduleDoc/module/document-merge/g-funcs/g[@name='IssueDate']/g-lang[lang($pLang)]"/>
This works fine when the moduleDoc variable is referencing the external file, but after the merge I get a Saxon error:
To use a result tree fragment in a path expression, either use
exsl:node-set() or specify version='1.1'
What is the proper way to access these nodes in the variable?
I figured out that the solution is to use the node-set extension function like this:
<xsl:value-of select="exsl:node-set($moduleDoc)/module/document-merge/g-funcs/g[@name='IssueDate']/g-lang[lang($pLang)]"/>
A namespace declaration is also needed:
xmlns:exsl="http://exslt.org/common"
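Assembled into one stylesheet, the pieces above might look like the sketch below. It assumes the processor supports the EXSLT common module (the Saxon error message suggests it does), trims the embedded data to a single entry, and introduces a second variable, module, purely for illustration so that the conversion happens only once.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="text"/>
  <xsl:param name="pLang" select="'no'"/>

  <!-- Embedded copy of the former external file; this is a result tree fragment. -->
  <xsl:variable name="moduleDoc">
    <module>
      <document-merge>
        <g-funcs>
          <g name="IssueDate">
            <g-lang xml:lang="no">Fakturadato</g-lang>
            <g-lang xml:lang="en">Issue date</g-lang>
          </g>
        </g-funcs>
      </document-merge>
    </module>
  </xsl:variable>

  <!-- Hypothetical helper variable: convert once, then use ordinary path expressions. -->
  <xsl:variable name="module" select="exsl:node-set($moduleDoc)/module"/>

  <xsl:template match="/">
    <xsl:value-of select="$module/document-merge/g-funcs/g[@name='IssueDate']/g-lang[lang($pLang)]"/>
  </xsl:template>
</xsl:stylesheet>
```

With the default pLang of 'no', this should select the Fakturadato entry. Note that the embedded elements must stay in no namespace (or the path must use matching prefixes) for the unprefixed steps to match.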

Can we declare global variable and add/Remove/Check value from it?

I want to achieve
Declare global variable having no value
<xsl:variable name="IsEqual"/>
Check variable value and change according to condition
<xsl:choose>
<!-- Checking value equal or not -->
<xsl:when test="name=$name">
<xsl:choose>
<!-- Checking variable value -->
<xsl:when test="$IsEqual !='Unequal'">
<!-- Setting variable value -->
<xsl:variable name="IsEqual" select="'Equal'"/>
</xsl:when>
</xsl:choose>
</xsl:when>
<xsl:otherwise>
<!-- Setting variable value -->
<xsl:variable name="IsEqual" select="'Unequal'"/>
</xsl:otherwise>
</xsl:choose>
<xsl:value-of select="$IsEqual"/>
The expected output is the value of the variable $IsEqual. If this is not possible, what is another way to achieve it? What should I use instead of a variable?
"It's not a bug, it's a feature": XSLT variables are designed not to be changeable. They could really be called constants. Working around that is difficult, though it can be done using parameters. In most cases it isn't necessary if you use the XSLT programming approach, where the program is driven by the data through templates.
The answer to your question is no.
(I copied this text from my own answer to "XML and Variables".)
@Sam: What do you want to accomplish using a global variable? Where do you want to check for equality? I have no idea what you want to do, so I can only give you a general example.
Try this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="test.xsl"?>
<root>
<data check="value1">This is data 1</data>
<data check="value2">This is data 2</data>
<data check="value3">This is data 3</data>
</root>
with this xslt file:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:variable name="v">value2</xsl:variable>
<xsl:template match="/">
<xsl:apply-templates select="root/data[@check = $v]"/>
</xsl:template>
<xsl:template match="data">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
Only the second data element will show up, as it is the one that matches the global variable, which stays the same all the time. If you want different values to match against, you can put them into your data file instead of using a variable and compare the different elements.
For testing just save the two files (test.xml and test.xsl) into one directory and open test.xml with your browser.
@Sam again: As you insist on changing an XSLT variable, I have to repeat that this can't be done. Maybe there is a way around it using the environment XSLT is running in, e.g. PHP, where you can pass functions into the script. I described the technique here: Can PHP communicate with XSLT?
The xslt spec says:
XSLT does not provide an equivalent to the Java assignment operator
x = "value";
because this would make it harder to create an implementation that processes a document other than in a batch-like way, starting at the beginning and continuing through to the end.
(See http://www.w3.org/TR/xslt#variables to prove my answer "no" to your question is correct :-)
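One common way to get the effect the question asks for without assignment is to compute the value once, inside the variable's own body. The sketch below is not the asker's stylesheet: it assumes an input of the form <root><name>...</name></root> and a stylesheet parameter called name, both chosen only for illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed parameter holding the value to compare against. -->
  <xsl:param name="name" select="'Sam'"/>

  <xsl:template match="/root">
    <!-- The variable is bound exactly once; the xsl:choose decides its value. -->
    <xsl:variable name="IsEqual">
      <xsl:choose>
        <xsl:when test="name = $name">Equal</xsl:when>
        <xsl:otherwise>Unequal</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <xsl:value-of select="$IsEqual"/>
  </xsl:template>
</xsl:stylesheet>
```

The condition moves into the variable declaration instead of the variable being updated from inside the condition, so nothing ever needs to be reassigned.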

Handling 0x19 in XSLT 1.0

I have encountered a problem when input xml contains the character 0x19. I have created a demo xslt to reproduce the issue.
My demo xslt looks like this:
<?xml version="1.0" encoding="utf-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
<xsl:output method="xml" indent="yes"/>
<xsl:param name="param1"/>
<xsl:template match="/">
<Value>
<xsl:value-of select="$param1"/>
</Value>
</xsl:template>
</xsl:transform>
I am passing the character 0x19 as param1. The following output is generated:
<Value>&#x19;</Value>
which is invalid XML. How can I get it right?
You are correct that
<Value>&#x19;</Value>
is not well-formed XML 1.0 - XML 1.0 does not allow any control characters below U+0020 except U+0009 (tab), U+000A (LF) and U+000D (CR), not even when expressed as numeric character references, so it is simply not possible to include that character in an XML 1.0 document. The processor is wrong to produce that output, it should raise an error to complain that you've tried to insert an illegal character in the output.
However, it is well-formed XML 1.1, which allows control characters as numeric character references (such as &#x19;) but not as literals. If your processor supports this (and the downstream components that will receive your output support it too), it may be sufficient to add version="1.1" to the xsl:output instruction
<xsl:output method="xml" indent="yes" version="1.1"/>
to tell it to output XML 1.1 instead of XML 1.0.

XSLT decimal-format causes exception

I am trying to use the xsl:decimal-format element, but I get the same error message whether I use my own code or the example code provided by w3schools.com. This is the w3schools sample code:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:decimal-format name="euro" decimal-separator="," grouping-separator="."/>
<xsl:template match="/">
<xsl:value-of select="format-number(26825.8, '#,###.00', 'euro')"/>
</xsl:template>
</xsl:stylesheet>
And this is the XsltException it produces when I run it in Visual Studio 2010:
"Format '#,###.00' cannot have zero digit symbol after digit symbol after decimal point."
What's wrong on my side that causes this error?
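Your decimal format, named "euro", changes the separators so that a valid number looks like "1.232,99" (one thousand two hundred and thirty-two point nine nine). The pattern you pass, "#,###.00", is still written with the default separators. Under the "euro" format the comma in the pattern is read as the decimal separator, so everything after it, "###.00", is the fraction part, and a zero digit following an optional digit there is exactly what the error message complains about.
Change your format-number pattern to "#.###,00".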
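For reference, the sample stylesheet with only the pattern changed might look like this (the xsl:output line is an addition for the sketch, so that the result is plain text):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- In the "euro" format, "," is the decimal separator and "." groups digits. -->
  <xsl:decimal-format name="euro" decimal-separator="," grouping-separator="."/>
  <xsl:template match="/">
    <!-- The pattern must use the same separators as the named format. -->
    <xsl:value-of select="format-number(26825.8, '#.###,00', 'euro')"/>
  </xsl:template>
</xsl:stylesheet>
```

This should produce 26.825,80.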