Remove unmatched parentheses and unmatched brackets from XML content - xslt-1.0

I saw related questions for Ruby and other languages, but no answers that cover this case.
I'm using XSLT to transform an XML document.
I'm forced to use XSLT 1.0, since the application that uses XSLT and XPath is still on 1.0.
Would you use a JavaScript function embedded in the XSL, with a regex?
I have XML like this:
<document>
<content name="PROD_MAJ_CLS_CD" type="text" vse-streams="2" u="20" action="cluster" weight="1">2</content>
<content name="PART_DESC_SHORT" type="text" vse-streams="2" u="22" action="cluster" weight="1">SCREW-ROCKER</content>
</document>
The text of the content element with name="PART_DESC_SHORT" can have parentheses and brackets in it.
Thanks,
Paul

In your example, none of the parentheses match, and all are open parentheses; for that case, the task of removing unmatched parentheses and brackets amounts to removing all parentheses and brackets. This is most easily done with the translate function:
<xsl:value-of select="translate(.,'([','')"/>
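As a minimal sketch of how that expression might sit in a complete XSLT 1.0 stylesheet (the identity template and the match pattern are assumptions based on the sample document, not something the question specifies):

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed target: the text of the PART_DESC_SHORT content element.
       translate() deletes every "(" and "[" because the third argument is empty. -->
  <xsl:template match="content[@name = 'PART_DESC_SHORT']/text()">
    <xsl:value-of select="translate(., '([', '')"/>
  </xsl:template>

</xsl:stylesheet>
```

All other elements, attributes and text pass through the identity template untouched.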
If you actually need to retain matched braces and transform input of the form a(b[c]d(e into ab[c]de, then the simplest way to do it in XSLT 1.0 is with a recursive named template which takes two parameters: an input string and a stack recording the material processed so far. The stack can be just a sequence of strings, each followed by a separator of your choosing that you hope won't appear in the input. If the input comes from an attribute value, &#x9; will probably be a safe choice (a tab will have been normalized out of the input unless the data producer used a numeric character reference for it). For element content, where tabs are not normalized away, you may prefer something like |||never-use-this-string-in-a-part-description||| instead.
On the initial call, you pass just the input string and let the stack take its default value.
On each call with non-empty input, the template takes one character off the input and does the right thing with it:
For "(" and "[", push a new string, consisting of just that character, onto the stack.
For "]" and ")" matching the first character of the top item on the stack, pop the stack twice, concatenate item 2, item 1 (the old top item), and the character "]" or ")", and push the concatenation onto the stack.
For "]" and ")" that don't match, the input is not well balanced, and you need to do something. (The code here drops the character.)
For any other character, append it to the top item on the stack.
So at any given time, every item on the stack except the bottom one begins with an open parenthesis or bracket which has not yet been matched. Every time we get a match, the parenthesized string is appended to the next item on the stack, thus reducing the stack size.
When the input string is exhausted and the stack has more than one item in it, the leading brace on the stack's top item needs to be stripped and the top item (minus the brace) appended to the next item.
Once the stack is down to one item, that item contains the string you wanted.
Here it is in XSLT:
<xsl:template name="paren-match-or-strip">
<xsl:param name="input"/>
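<!--* Default stack: a single empty bottom item, so that an unmatched
* brace at the very start of the input is stripped as well.
* $sep is assumed to be a global variable holding the separator. *-->
<xsl:param name="stack" select="$sep"/>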
<xsl:variable name="char"
select="substring($input,1,1)"/>
<xsl:variable name="stacktop"
select="substring-before($stack,$sep)"/>
<xsl:variable name="stackrest"
select="substring-after($stack,$sep)"/>
<xsl:choose>
<xsl:when test="not($input = '')">
<xsl:choose>
<xsl:when test="$char = '(' or $char = '['">
<!--* Push another potential-left-brace on the stack *-->
<xsl:call-template name="paren-match-or-strip">
<xsl:with-param name="input"
select="substring($input,2)"/>
<xsl:with-param name="stack"
select="concat(
$char,
$sep,
$stack
)"/>
</xsl:call-template>
</xsl:when>
<xsl:when test="($char = ']' and substring($stacktop,1,1) = '[')
or
($char = ')' and substring($stacktop,1,1) = '(')
">
<!--* Match the left brace at the top of the stack *-->
<xsl:variable name="stacktop2"
select="substring-before($stackrest,$sep)"/>
<xsl:variable name="stackrest2"
select="substring-after($stackrest,$sep)"/>
<xsl:call-template name="paren-match-or-strip">
<xsl:with-param name="input"
select="substring($input,2)"/>
<xsl:with-param name="stack"
select="concat(
$stacktop2,
$stacktop,
$char,
$sep,
$stackrest2
)"/>
</xsl:call-template>
</xsl:when>
<xsl:when test="$char = ']' or $char = ')'">
<!--* Unmatched right brace, drop it silently *-->
<xsl:call-template name="paren-match-or-strip">
<xsl:with-param name="input"
select="substring($input,2)"/>
<xsl:with-param name="stack"
select="$stack"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="paren-match-or-strip">
<xsl:with-param name="input"
select="substring($input,2)"/>
<xsl:with-param name="stack"
select="concat(
$stacktop,
$char,
$sep,
$stackrest
)"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:when>
<xsl:when test="$input = '' and contains($stackrest,$sep)">
<!--* Input is exhausted and at least one unmatched
* parenthesis is on the stack.
*-->
<xsl:variable name="stacktop2"
select="substring-before($stackrest,$sep)"/>
<xsl:variable name="stackrest2"
select="substring-after($stackrest,$sep)"/>
<xsl:call-template name="paren-match-or-strip">
<xsl:with-param name="input"
select="$input"/>
<xsl:with-param name="stack"
select="concat(
$stacktop2,
substring($stacktop,2),
$sep,
$stackrest2
)"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<!--* input is exhausted, stack only has one item *-->
<xsl:value-of select="substring-before($stack,$sep)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
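The template refers to $sep without declaring it, so it has to be supplied by the surrounding stylesheet. A minimal sketch of one way to declare it and call the template, assuming the template above lives in the same stylesheet; the match pattern and the choice of a tab as separator are assumptions taken from the sample document and the discussion above:

```xml
<!-- Assumed global separator; it must not occur in the data being processed -->
<xsl:variable name="sep" select="'&#x9;'"/>

<xsl:template match="content[@name = 'PART_DESC_SHORT']">
  <xsl:copy>
    <!-- Keep the element's attributes as they are -->
    <xsl:copy-of select="@*"/>
    <xsl:call-template name="paren-match-or-strip">
      <xsl:with-param name="input" select="string(.)"/>
      <!-- Start with one empty bottom item on the stack -->
      <xsl:with-param name="stack" select="$sep"/>
    </xsl:call-template>
  </xsl:copy>
</xsl:template>
```

Passing the initial stack explicitly means an unmatched brace at the very start of the input sits above the empty bottom item, so it can be stripped when the input runs out.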

Related

xslt 3.0 conditional dynamic element creation

I have a variable:
<xsl:variable name="courseType" select="Record[1]/course-type"/>
and based on a value I would like to create a dynamic element:
<xsl:if test="$courseType ='B'">
<xsl:element name="newElement">
</xsl:if>
...
other nodes
...
<xsl:if test="$courseType ='B'">
</xsl:element>
</xsl:if>
The problem is that:
The element type "xsl:element" must be terminated by the matching end-tag "</xsl:element>".
Is there any way to achieve this?
You haven't provided much context, but you can certainly use e.g.
<xsl:variable name="other-nodes">
...
</xsl:variable>
<xsl:choose>
<xsl:when test="$courseType = 'B'">
<newElement>
<xsl:sequence select="$other-nodes"/>
</newElement>
</xsl:when>
<xsl:otherwise>
<xsl:sequence select="$other-nodes"/>
</xsl:otherwise>
</xsl:choose>
Note that I have used a literal result element for newElement instead of xsl:element. I only consider xsl:element necessary if you want to compute the element name at run-time, but for the original problem it does not matter which of the two you use.
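For contrast, here is a sketch of the case where xsl:element is actually needed, because the element name is only known at run-time. The match pattern, the variable wrapperName and the fallback name defaultElement are invented for illustration; only courseType and newElement come from the question:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed context: the document element that contains the Record elements -->
  <xsl:template match="/*">
    <xsl:variable name="courseType" select="Record[1]/course-type"/>
    <!-- Hypothetical mapping from course type to wrapper element name -->
    <xsl:variable name="wrapperName"
                  select="if ($courseType = 'B') then 'newElement' else 'defaultElement'"/>
    <!-- The attribute value template {…} computes the element name at run-time -->
    <xsl:element name="{$wrapperName}">
      <xsl:copy-of select="node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Unlike the xsl:choose version, this always emits a wrapper and only varies its name, which is the situation a literal result element cannot express.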

xslt string replace concrete example

I've found the template Codesling has provided as a substitute for the replace function in XSLT 1.0: XSLT string replace
The problem I'm having is that I cannot figure out how to adapt it to my concrete example, so I'm hoping for some pointers here.
I've got an XML element, ./bib-info, that looks like this (for example):
<bib-info>Gäbler, René, 1971- [(DE-588)138691134]: Schnell-Umstieg auf Office 2010, [2010]</bib-info>
My goal is to replace the blanks between the "]:" and the beginning of the following text (in this case "Schnell"). The number of blanks is always the same btw.
Here are two other examples:
<bib-info>Decker, Karl-Heinz, 1948-2010 [(DE-588)141218622]: Funktion, Gestaltung und Berechnung, 1963</bib-info>
<bib-info>Decker, Karl-Heinz, 1948-2010 [(DE-588)141218622]: Maschinenelemente, 1963</bib-info>
Could anybody please give me a hint how to write the calling code
<xsl:variable name="newtext">
<xsl:call-template name="string-replace-all">
<xsl:with-param name="text" select="$text" />
<xsl:with-param name="replace" select="a" />
<xsl:with-param name="by" select="b" />
</xsl:call-template>
</xsl:variable>
so that it matches my problem?
Thanks in advance
Kate
My goal is to replace the blanks between the "]:" and the beginning of the following text
An example doing this is the following:
<xsl:template match="bib-info">
<xsl:variable name="newtext">
<xsl:call-template name="string-replace-all">
<xsl:with-param name="text" select="text()" />
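<!-- 'replace' is "]:" followed by the exact run of blanks found in the data;
     'by' is "]:" followed by a single blank -->
<xsl:with-param name="replace" select="']: '" />
<xsl:with-param name="by" select="']: '" />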
</xsl:call-template>
</xsl:variable>
<bib-info><xsl:value-of select="$newtext" /></bib-info>
</xsl:template>
Note that this only replaces real spaces and not tabs or other special characters.
BTW replacing strings may not be the best choice in this case. You can achieve the same with the substring functions of XSLT-1.0:
<xsl:template match="bib-info">
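<!-- When searching, the string is "]:" followed by the run of blanks as it occurs in the data;
     concat() then rejoins the two parts with "]:" and a single blank -->
<xsl:variable name="newtext1" select="substring-before(text(),']: ')" />
<xsl:variable name="newtext2" select="substring-after(text(),']: ')" />
<xsl:variable name="newtext" select="concat($newtext1,']: ',$newtext2)" />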
<bib-info><xsl:value-of select="$newtext" /></bib-info>
</xsl:template>
The output is the same for both templates.
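The string-replace-all template itself is not shown in this thread. One common shape for such a template in XSLT 1.0 is sketched below; it is an assumption about what the referenced template roughly looks like, not a copy of it:

```xml
<xsl:template name="string-replace-all">
  <xsl:param name="text"/>
  <xsl:param name="replace"/>
  <xsl:param name="by"/>
  <xsl:choose>
    <!-- Guard against an empty search string: contains($text, '') is always true
         and would make the recursion run forever -->
    <xsl:when test="$replace != '' and contains($text, $replace)">
      <!-- Emit everything before the first match, then the replacement -->
      <xsl:value-of select="substring-before($text, $replace)"/>
      <xsl:value-of select="$by"/>
      <!-- Recurse on the remainder of the string -->
      <xsl:call-template name="string-replace-all">
        <xsl:with-param name="text" select="substring-after($text, $replace)"/>
        <xsl:with-param name="replace" select="$replace"/>
        <xsl:with-param name="by" select="$by"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- No (further) match: emit the rest unchanged -->
      <xsl:value-of select="$text"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

Because the recursion continues on the text after each match, the replacement string is never rescanned, so a "by" value that contains the "replace" value does not loop.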

How to multiply numbers with each other in xslt 1.0

I have a number, for example 123, stored in an element, let's say
<Number>123</Number>. I am looking for a solution written in XSLT 1.0 which can compute 1*2*3 and give me the result 6. The value in the Number element can be of any length. I know I can do this with the substring function, storing the digits one by one in variables, but the problem is that I don't know the length of this field.
I could not write any XSLT for this.
Can anyone help or suggest a solution?
You can do this by calling a recursive named template:
<xsl:template name="digit-product">
<xsl:param name="digits"/>
<xsl:param name="prev-product" select="1"/>
<xsl:variable name="product" select="$prev-product * substring($digits, 1, 1)" />
<xsl:choose>
<xsl:when test="string-length($digits) > 1">
<!-- recursive call -->
<xsl:call-template name="digit-product">
<xsl:with-param name="digits" select="substring($digits, 2)"/>
<xsl:with-param name="prev-product" select="$product"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$product"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Working example: http://xsltransform.net/3NJ38YJ
Note that this assumes the value of the passed digits parameter is a non-negative integer, i.e. a non-empty string of digits.
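A sketch of how the template might be called for the Number element from the question; the product wrapper element is a made-up name for illustration:

```xml
<xsl:template match="Number">
  <product>
    <xsl:call-template name="digit-product">
      <!-- normalize-space() strips stray whitespace around the digits -->
      <xsl:with-param name="digits" select="normalize-space(.)"/>
      <!-- prev-product is left at its default of 1 -->
    </xsl:call-template>
  </product>
</xsl:template>
```

For <Number>123</Number> the recursion computes 1*1, then 1*2, then 2*3, and emits 6.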

XSL : How to pass variable to for each

<xsl:for-each select="$script/startWith">
<xsl:variable name = "i" >
<xsl:value-of select="$script/startWith[position()]"/>
</xsl:variable>
<xsl:for-each select="JeuxDeMots/Element">
<xsl:variable name = "A" >
<xsl:value-of select="eName"/>
</xsl:variable>
<xsl:if test="starts-with($A,$i)= true()">
<xsl:variable name="stringReplace">
<xsl:value-of select="substring($i,0,3)"/>
</xsl:variable>
<xsl:value-of select="$stringReplace"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
Problem: the variable $i cannot be passed into an xsl:for-each.
Please help me.
Try replacing the declaration of i with
<xsl:variable name="i" select="string(.)"/>
The context item (i.e. ".") is different for each evaluation of the for-each instruction, but the value of the expression $script/startWith[position()] does not change. (Here, you are making a sequence of startWith elements and filtering each one by the predicate [position()]. The expression position() returns a number, and a numeric predicate is true when that number equals the context position, which position() always does. So the predicate [position()] is doing no work at all here.)
Also, you'll want to replace the declaration of stringReplace with
<xsl:variable name="stringReplace" select="substring($i,1,3)"/>
(String offsets begin with 1 in XPath, not with 0.)
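A small stand-alone stylesheet that shows the difference between the two offsets; the sample string '12345' is just an illustration and can be run against any input document:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Start 0, length 3 covers positions 0, 1 and 2; position 0 does not exist,
         so only the characters at positions 1 and 2 are returned: 12 -->
    <xsl:value-of select="substring('12345', 0, 3)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Start 1, length 3 covers positions 1, 2 and 3: 123 -->
    <xsl:value-of select="substring('12345', 1, 3)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```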
I am guessing that you want to process all of the startWith children of $script, and for each one emit the first three characters of the startWith value once, for each startWith/JeuxDeMots/Element whose eName child begins with the value of startWith.
Actually, the whole thing might be a little easier to read if it were briefer and more direct:
<xsl:for-each select="$script/startWith">
<xsl:variable name = "i" select="string()"/>
<xsl:for-each select="JeuxDeMots/Element">
<xsl:if test="starts-with(eName,$i)">
<xsl:value-of select="substring($i,1,3)"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
It makes perfect sense to create the variables $A and $stringReplace if they are used several times in code you're not showing us, but if not, ...
I think the problem is that you change the context in the first for-each, so the element JeuxDeMots is no longer "visible" to the second for-each. You can, for example, store it in a variable and use that variable in the second for-each (there are also other ways to solve this problem).
<xsl:template match="/">
<xsl:variable name="doc" select="JeuxDeMots"/>
<xsl:choose>
<xsl:when test="$script/startWith">
<xsl:for-each select="$script/startWith">
<xsl:variable name="i">
<xsl:value-of select="."/>
</xsl:variable>
<xsl:for-each select="$doc/Element">
<xsl:variable name="A">
<xsl:value-of select="eName"/>
</xsl:variable>
<xsl:if test="starts-with($A,$i) = true()">
<xsl:variable name="stringReplace">
<xsl:value-of select="substring($i,0,3)"/>
</xsl:variable>
<xsl:value-of select="$stringReplace"/>
</xsl:if>
</xsl:for-each>
</xsl:for-each>
</xsl:when>
</xsl:choose>
</xsl:template>
Although I'm not sure exactly what you are dealing with, it seems to output the desired value AsAt.
You could also consider C. M. Sperberg-McQueen's post about string offsets in XPath.

xslt - how to create a variable with text content of space characters of a dynamically determined length that will perform well

I would like to create a variable that contains a text value consisting of a number of space characters, but the number of characters is not known until runtime. This needs to be done a great many times, so I would like to use something that will perform very well.
One option is the substring() function on a previously declared node, but this limits the length to no more than that of the original text.
Another would be to use a recursive template with the concat() function, but I'm not sure about the performance implications.
What is a way to do this that will perform very well?
I would create a global variable of length say 4000 spaces. Then if the desired string is less than 4000 spaces, use substring(); if greater, use a recursive approach as outlined by Durkin. In 2.0 of course the code is much simpler to write but choosing an approach that performs well is still an interesting little problem.
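A minimal sketch of that idea in XSLT 1.0. The names pad and pad-spaces are made up for illustration, the pad is kept far shorter than 4000 spaces so the listing stays readable, and the simple fallback recursion is an assumption rather than the approach referred to above:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical global pad; in practice make it as long as the runs you commonly need -->
  <xsl:variable name="pad" select="'                                '"/>

  <xsl:template name="pad-spaces">
    <xsl:param name="count"/>
    <xsl:choose>
      <!-- Common case: one substring() call, no recursion -->
      <xsl:when test="$count &lt;= string-length($pad)">
        <xsl:value-of select="substring($pad, 1, $count)"/>
      </xsl:when>
      <!-- Rare case: emit a whole pad, then recurse for what is left -->
      <xsl:otherwise>
        <xsl:value-of select="$pad"/>
        <xsl:call-template name="pad-spaces">
          <xsl:with-param name="count" select="$count - string-length($pad)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Example call: 40 spaces between two brackets -->
  <xsl:template match="/">
    <xsl:text>[</xsl:text>
    <xsl:call-template name="pad-spaces">
      <xsl:with-param name="count" select="40"/>
    </xsl:call-template>
    <xsl:text>]</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Testing against string-length($pad) rather than a hard-coded number means the template stays correct if the pad is later lengthened.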
You can use recursion and divide and conquer. This keeps the recursion depth to order log N (the total work is still proportional to N, since N characters are produced). Alternatively, if your XSLT processor implements tail-end recursion optimisation, then you can use a tail-end recursion solution for better safety at scale. Here are the two solutions...
For non tail-end recursion optimising XSLT processors...
<xsl:template name="spaces">
<xsl:param name="count" />
<xsl:choose>
<xsl:when test="$count &lt;= 10">
<!-- the string literal holds exactly 10 spaces -->
<xsl:value-of select="substring('          ',1,$count)" />
</xsl:when>
<xsl:otherwise>
<!-- idiv needs XPath 2.0; in XSLT 1.0 write floor($count div 2) instead -->
<xsl:call-template name="spaces">
<xsl:with-param name="count" select="$count idiv 2" />
</xsl:call-template>
<xsl:call-template name="spaces">
<xsl:with-param name="count" select="$count - ($count idiv 2)" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
For tail-end recursion optimising XSLT processors...
<xsl:template name="spaces">
<xsl:param name="count" />
<xsl:choose>
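<xsl:when test="$count &lt;= 10">
<!-- the string literal holds exactly 10 spaces -->
<xsl:value-of select="substring('          ',1,$count)" />
</xsl:when>
<xsl:otherwise>
<!-- emit 10 spaces, then recurse for the remaining $count - 10 -->
<xsl:text>          </xsl:text>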
<xsl:call-template name="spaces">
<xsl:with-param name="count" select="$count - 10" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
For an efficient (both time and space) XSLT 1.0 solution, see this:
http://www.sourceware.org/ml/xsl-list/2001-07/msg01040.html