Recursive transformations using xslt, xpath:document() and mediawiki - api

I want to use the Wikipedia API to find the French pages that include Template:Infobox Scientifique but are missing from the English version. My idea was to process the following document with xproc:
http://fr.wikipedia.org/w/api.php?action=query&format=xml&list=embeddedin&eititle=Template:Infobox%20Scientifique&eilimit=400
and the following xslt stylesheet:
<?xml version='1.0' ?>
<xsl:stylesheet
xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
version='1.0'
>
<xsl:output method='text' indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates select="api"/>
</xsl:template>
<xsl:template match="api">
<xsl:for-each select="query/embeddedin/ei">
<xsl:variable name="title" select="translate(@title,&apos; &apos;,&apos;_&apos;)"/>
<xsl:variable name="english-title">
<xsl:call-template name="englishTitle"><xsl:with-param name="title" select="@title"/></xsl:call-template>
</xsl:variable>
<xsl:value-of select="$english-title"/><xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
<xsl:template name="englishTitle">
<xsl:param name="title"/>
<xsl:variable name="uri1" select="concat(&apos;http://fr.wikipedia.org/w/api.php?action=query&amp;format=xml&amp;prop=langlinks&amp;lllimit=500&amp;titles=&apos;,translate($title,&apos; &apos;,&apos;_&apos;))"/>
<xsl:message><xsl:value-of select="$uri1"/></xsl:message>
<xsl:message>count=<xsl:value-of select="count(document($uri1,/api/query/pages/page/langlinks/ll))"/></xsl:message>
</xsl:template>
</xsl:stylesheet>
The XSLT extracts all the articles containing the template, and for each article I want to call Wikipedia to get the links between the wikis. Here the template englishTitle calls the XPath function document().
But it always reports count(ll)=1 even though there are plenty of nodes (e.g. http://fr.wikipedia.org/w/api.php?action=query&format=xml&prop=langlinks&lllimit=500&titles=Carl_Sagan ).
Can't I process the nodes returned by the document() function?

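In your version the path ended up inside the parentheses, as the second argument of document(). That argument only supplies a base URI for resolving relative URIs, so document() still returns just the root node of the fetched document and count() gives 1. Apply the path to the result of document() instead: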
<xsl:value-of select="count(document($uri1)/api/query/pages/page/langlinks/ll)"/>
On a different note - what is
translate(@title,&apos; &apos;,&apos;_&apos;)
supposed to mean? What's wrong with:
translate(@title, ' ', '_')
There is no need to escape quotes in an XML attribute value unless the quote is the same kind that delimits the value. All of these are valid:
name="foo&quot;'foo"
name='foo&apos;"foo'
Your entire transformation can be reduced to something like this:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output method="text" />
<xsl:param name="baseUrl" select="'http://fr.wikipedia.org/w/api.php?action=query&amp;format=xml&amp;prop=langlinks&amp;lllimit=500&amp;titles='" />
<xsl:template match="ei">
<xsl:variable name="uri" select="concat($baseUrl ,translate(@title,' ','_'))"/>
<xsl:variable name="doc" select="document($uri)"/>
<xsl:value-of select="$uri"/>
<xsl:text>
</xsl:text>
<xsl:text>count=</xsl:text>
<xsl:value-of select="count($doc/api/query/pages/page/langlinks/ll)"/>
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="text()" />
</xsl:stylesheet>
Let the XSLT default templates work for you - they do all of the recursion in the background, all you have to do is catch the nodes you want to process (and prevent output of unnecessary text by overriding the default text() template with an empty one).
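To get from the count to the original goal (French pages with no English counterpart), the same pattern can filter on the language of each link. The following is only a sketch: it assumes that every `ll` element in the API response carries a `lang` attribute, and it does not URL-encode the titles, so titles containing accented or reserved characters may need extra handling that XSLT 1.0 does not provide out of the box.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:param name="baseUrl"
      select="'http://fr.wikipedia.org/w/api.php?action=query&amp;format=xml&amp;prop=langlinks&amp;lllimit=500&amp;titles='"/>

  <xsl:template match="ei">
    <xsl:variable name="uri"
        select="concat($baseUrl, translate(@title, ' ', '_'))"/>
    <!-- Assumption: each language link is an <ll> element with a lang attribute -->
    <xsl:variable name="links"
        select="document($uri)/api/query/pages/page/langlinks/ll"/>
    <!-- Print the French title only when no English link was found -->
    <xsl:if test="not($links[@lang = 'en'])">
      <xsl:value-of select="@title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the default output of text nodes -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Each `ei` element triggers one HTTP request, so for a list of 400 pages the transformation may be slow.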

Related

XSLT to CSV - How to control the order of the same element with different attribute values?

I am quite new to XSLT and need to figure out how to write repeating elements from an XML file to a CSV file in a specific order based on their attribute value. I ultimately need to import the content of the resulting CSV file into SQL Server tables, so the order of the retrieved elements within the CSV file matters: they need to match the table columns defined as headers.
My problem occurs with the Project Content_Detail element that exists in different languages and can appear in any order in the XML source. I only need to extract the German version with the Title, Goal element first, followed by the English version elements.
I use SSIS with MS Visual Studio 2019 to transform my XML to a CSV file. MS Visual Studio supports only XSLT 1.0.
Here is my XML file (EDITED):
<?xml version="1.0" encoding="utf-8"?>
<dta:Projects xmlns:dta="http://domain.test/dta">
<dta:Project>
<dta:Core-Basis>
<dta:Goal>1672</dta:Goal>
</dta:Core-Basis>
<dta:Basisinfo>
<dta:Content>
<dta:Content_Detail Lang="de">
<dta:Title>Wirtschaft</dta:Title>
<dta:Aim>Steigerung</dta:Aim>
</dta:Content_Detail>
<dta:Content_Detail Lang="en">
<dta:Title>Economy</dta:Title>
</dta:Content_Detail>
</dta:Content>
</dta:Basisinfo>
</dta:Project>
<dta:Project >
<dta:Core-Basis>
<dta:Goal>2035</dta:Goal>
</dta:Core-Basis>
<dta:Basisinfo>
<dta:Content>
<dta:Content_Detail Lang="en">
<dta:Title>Environmental Protection</dta:Title>
<dta:Aim>Facilitation</dta:Aim>
</dta:Content_Detail>
<dta:Content_Detail Lang="de">
<dta:Title>Naturschutz</dta:Title>
</dta:Content_Detail>
</dta:Content>
</dta:Basisinfo>
</dta:Project >
</dta:Projects>
This is my XSLT file (EDITED). I tried to add numerous order and sort arguments to the xsl:if block but I couldn’t come up with a working solution yet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:dta="http://domain.test/dta" exclude-result-prefixes="dta">
<xsl:output method="text" encoding="UTF-8" indent="no"/>
<xsl:template match="/">
<xsl:text>DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;
</xsl:text>
<xsl:apply-templates mode="find_content"/>
</xsl:template>
<xsl:template match="text()|@*" mode="find_content"/>
<xsl:template match="dta:Project" mode="find_content">
<xsl:value-of select="concat(dta:Core-Basis/dta:Goal,';')"/>
<xsl:apply-templates mode="find_content"/>
</xsl:template>
<xsl:template match="dta:Content_Detail" mode="find_content">
<xsl:if test="@Lang='de'">
<xsl:value-of select="concat(dta:Title,';')"/>
<xsl:value-of select="concat(dta:Aim,';')"/>
</xsl:if>
<xsl:if test="@Lang='en'">
<xsl:value-of select="concat(dta:Title,';')"/>
<xsl:value-of select="concat(dta:Aim,'
')"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
This is my CSV output so far, not sorting the Content_detail based on their language value attribute:
DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;
1672;Wirtschaft;Steigerung;Economy;
2035;Environmental Protection;Facilitation
Naturschutz;;
This is what I need:
DTA_Goal; DTA_Title_de; DTA_Aim_de; DTA_Title_en; DTA_Aim_en;
1672;Wirtschaft;Steigerung;Economy;;
2035;Naturschutz;;Environmental Protection;Facilitation;
As I am not a professional programmer, any help or comment also on the rest of my code is highly appreciated.
In your input, one of the dta:Project elements has a dta:Core-Basis child, while the other has dta:Core-Basisdaten. Assuming that's a mistake*, you could produce the needed output simply by:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dta="http://domain.test/dta">
<xsl:output method="text" encoding="UTF-8" />
<xsl:template match="/dta:Projects">
<!-- header -->
<xsl:text>DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;
</xsl:text>
<!-- data -->
<xsl:for-each select="dta:Project">
<xsl:value-of select="dta:Core-Basis/dta:Goal"/>
<xsl:text>;</xsl:text>
<!-- de -->
<xsl:variable name="content-de" select="dta:Basisinfo/dta:Content/dta:Content_Detail[@Lang='de']" />
<xsl:value-of select="$content-de/dta:Title"/>
<xsl:text>;</xsl:text>
<xsl:value-of select="$content-de/dta:Aim"/>
<xsl:text>;</xsl:text>
<!-- en -->
<xsl:variable name="content-en" select="dta:Basisinfo/dta:Content/dta:Content_Detail[@Lang='en']" />
<xsl:value-of select="$content-en/dta:Title"/>
<xsl:text>;</xsl:text>
<xsl:value-of select="$content-en/dta:Aim"/>
<xsl:text>;</xsl:text>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
(*) If not, change:
<xsl:value-of select="dta:Core-Basis/dta:Goal"/>
to:
<xsl:value-of select="(dta:Core-Basis|dta:Core-Basisdaten)/dta:Goal"/>
P.S. Not sure why you need the trailing ; at the end of each line.
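If you would rather keep a template-based structure like the one in the question, the order can also be controlled by applying templates to each language explicitly instead of relying on document order. This is a sketch, not a drop-in replacement: it assumes both projects use dta:Core-Basis and that every project has exactly one Content_Detail per language. If a language is missing entirely, its two columns are skipped and the row shifts, which the variable-based version above avoids.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dta="http://domain.test/dta">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/dta:Projects">
    <xsl:text>DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;&#10;</xsl:text>
    <xsl:apply-templates select="dta:Project"/>
  </xsl:template>

  <xsl:template match="dta:Project">
    <xsl:variable name="details"
        select="dta:Basisinfo/dta:Content/dta:Content_Detail"/>
    <xsl:value-of select="concat(dta:Core-Basis/dta:Goal, ';')"/>
    <!-- German first, then English, regardless of the order in the source -->
    <xsl:apply-templates select="$details[@Lang='de']"/>
    <xsl:apply-templates select="$details[@Lang='en']"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- A missing Title or Aim yields an empty string, so the column stays in place -->
  <xsl:template match="dta:Content_Detail">
    <xsl:value-of select="concat(dta:Title, ';', dta:Aim, ';')"/>
  </xsl:template>
</xsl:stylesheet>
```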

Remove Trailing zeros using xsl1.0 or xslt 2.0

I have to sum the values from the XML and remove the trailing zeros from the result.
Can you help me do this using XSLT 1.0 or XSLT 2.0?
I have tried number(.), but it does not remove the trailing zeros.
My input is below
<test>
<loop>
<lines>
<linesTotal>2010</linesTotal>
</lines>
<lines>
<linesTotal>20</linesTotal>
</lines>
</loop>
</test>
Expected output is
203
but it results 2030
Please help me!
I can't see why this would be useful (at least not in the given example), and I strongly suspect you don't really want to do this.
But - mainly for fun - here's a method to remove any trailing zeros from the result:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/test">
<output>
<xsl:call-template name="remove-trailing-zeros">
<xsl:with-param name="number" select="sum(loop/lines/linesTotal)"/>
</xsl:call-template>
</output>
</xsl:template>
<xsl:template name="remove-trailing-zeros">
<xsl:param name="number"/>
<xsl:choose>
<xsl:when test="$number and not($number mod 10)">
<xsl:call-template name="remove-trailing-zeros">
<xsl:with-param name="number" select="$number div 10"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$number"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
In XSLT 2.0, you could make this a function instead of a template. Or move the problem to the string domain (as suggested in another answer):
<xsl:template match="/test">
<xsl:param name="sum" select="sum(loop/lines/linesTotal)"/>
<output>
<xsl:value-of select="replace(string($sum), '0+$', '')"/>
</output>
</xsl:template>
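The function variant mentioned above could look roughly like this. It is a sketch under a few assumptions: the `my` prefix and its namespace URI are made up for the example, and the sum is assumed to be a whole number so that it can be cast to xs:integer.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical helper: divides by 10 for as long as the last digit is 0 -->
  <xsl:function name="my:remove-trailing-zeros" as="xs:integer">
    <xsl:param name="n" as="xs:integer"/>
    <xsl:sequence select="if ($n ne 0 and $n mod 10 eq 0)
                          then my:remove-trailing-zeros($n idiv 10)
                          else $n"/>
  </xsl:function>

  <xsl:template match="/test">
    <output>
      <xsl:value-of
          select="my:remove-trailing-zeros(xs:integer(sum(loop/lines/linesTotal)))"/>
    </output>
  </xsl:template>
</xsl:stylesheet>
```

The `$n ne 0` guard plays the same role as the `$number and ...` test in the XSLT 1.0 template: without it a sum of 0 would recurse forever.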
In XSLT 2.0 you can use the replace() function (note that '0$' strips only a single trailing zero; use '0+$' to strip all of them):
replace(string(2010+30), '0$', '')

Concat using for each loop and and keyword

Recently I encountered a case where I need to apply a for-each loop and concatenate strings using the word 'and'. Below is part of my XML document.
<?xml version="1.0" encoding="utf-8"?>
<case.ref.no.group>
<case.ref.no>
<prefix>Civil Appeal</prefix>
<number>W-02-887</number>
<year>2008</year>
</case.ref.no>
<case.ref.no>
<prefix>Civil Appeal</prefix>
<number>W-02-888</number>
<year>2008</year>
</case.ref.no>
</case.ref.no.group>
I tried the XSLT below on it.
<xsl:template match="case.ref.no.group">
<xsl:variable name="pre">
<section class="sect2">
<xsl:text disable-output-escaping="yes">Court of Appeal</xsl:text>
</section>
</xsl:variable>
<xsl:variable name="tex">
<xsl:value-of select="./case.ref.no/prefix"/>
</xsl:variable>
<xsl:variable name="iter">
<xsl:value-of select="./case.ref.no/number"/>
<xsl:if test="following::case.ref.no/number">;</xsl:if>
</xsl:variable>
<xsl:variable name="year">
<xsl:value-of select="./case.ref.no/year"/>
</xsl:variable>
<div class="para">
<xsl:value-of select="concat($pre,' – ',$tex,' Nos. ',$iter,'-',$year)"/>
</div>
</xsl:template>
When I run it, it gives me the output below.
Court of Appeal – Civil Appeal Nos. W-02-887 2008
But I want it to be as below.
Court of Appeal – Civil Appeal Nos. W-02-887-2008 and W-02-888-2008
Please let me know how I can achieve this. I'm doing this in XSLT 1.0.
Thanks
I don't quite understand what exactly you are trying to do. You mention for-each, but it is not present in your code; you mention the word "and", but you don't use it :-)
If I use the following stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:apply-templates select="case.ref.no.group" />
</output>
</xsl:template>
<xsl:template match="case.ref.no.group">
<section class="sect2">
<xsl:text>Court of Appeal</xsl:text>
</section>
<xsl:text> - </xsl:text>
<xsl:value-of select="case.ref.no[1]/prefix" />
<xsl:text> Nos. </xsl:text>
<xsl:for-each select="case.ref.no">
<xsl:value-of select="number" />
<xsl:text>-</xsl:text>
<xsl:value-of select="year" />
<xsl:if test="not(position() = last())">
<xsl:text> and </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
I get this result
<?xml version="1.0" encoding="UTF-8"?>
<output xmlns:fo="http://www.w3.org/1999/XSL/Format"><section class="sect2">Court of Appeal</section> - Civil Appeal Nos. W-02-887-2008 and W-02-888-2008</output>
But as I said, I'm not sure I understand your needs. For example, you may need some kind of grouping (will the prefix always be the same in every <case.ref.no> under one parent <case.ref.no.group>?), etc.
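If a group can contain more than two case numbers, the same position()/last() test can be extended so that commas separate the items and "and" appears only before the last one. This is a sketch that goes slightly beyond what was asked; it uses text output and a plain hyphen after "Court of Appeal" to keep the example short.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="case.ref.no.group">
    <xsl:text>Court of Appeal - </xsl:text>
    <xsl:value-of select="case.ref.no[1]/prefix"/>
    <xsl:text> Nos. </xsl:text>
    <xsl:for-each select="case.ref.no">
      <xsl:value-of select="concat(number, '-', year)"/>
      <xsl:choose>
        <!-- second-to-last item: join to the last one with "and" -->
        <xsl:when test="position() = last() - 1">
          <xsl:text> and </xsl:text>
        </xsl:when>
        <!-- any earlier item: join with a comma -->
        <xsl:when test="position() != last()">
          <xsl:text>, </xsl:text>
        </xsl:when>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the two entries from the question this still produces "W-02-887-2008 and W-02-888-2008"; with three entries the first separator becomes a comma.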

how to display content that appears once outside the repeating nodes in the expression of every node in xslt

I've got data in XML that has some header information then a series of items. I'm using XSLT to translate that into a different format, also with a header area and a series of items.
However in the post translation result, I want one piece of data only found in the header to be included in every instance of the items, even though it will simply be repeating the same value. (this value may change so I cannot hard code it)
Sample data (significantly simplified)
<rss>
<channel>
<title>Playlist One</title>
<items>
<item>
<title>Video One</title>
</item>
<item>
<title>Video Two</title>
</item>
<item>
<title>Video Three</title>
</item>
</items>
</channel>
</rss>
My desired result is something like this:
playlist_header_title=Playlist One
playlist_title=Playlist One
video_title=Video One
playlist_title=Playlist One
video_title=Video Two
playlist_title=Playlist One
video_title=Video Three
My XSLT is very complicated (and unfortunately I inherited it from someone else so I'm not sure what everything does, I've self-taught myself online but am in a bit over my head)
Roughly the key pieces look like this:
<xsl:template name="rss" match="/">
<xsl:variable name="playlist_title">
<xsl:value-of select="string(/rss/channel/title)"/>
</xsl:variable>
<xsl:for-each select="rss">
<xsl:apply-templates name="item" select="channel/items/item"/>
</xsl:for-each>
</xsl:template>
Then there's a massive template called "item" that I won't include here, but basically it outputs all the item data as desired, I just can't figure out how to access the "playlist_title".
When I try to call (from inside the template "item")
<xsl:value-of select="string($playlist_title)"/>
it returns a blank. I assume this is because the variable was created outside the for-each loop and so is not available. (It displays the data correctly when I output it before the for-each loop, in the result's version of the header, but that doesn't get me far enough.)
I've tried using with-param in the apply-templates, and also tried changing it to call-template inside another loop also using with-param, but they also display a blank.
I've also tried sending in a string rather than pulling the playlist_title from the XML just to confirm I am able to pass any value into the template, but they too come out blank.
For instance:
<xsl:for-each select="channel/items/item">
<xsl:call-template name="item">
<xsl:with-param name="playlist_title">blah</xsl:with-param>
</xsl:call-template>
</xsl:for-each>
<xsl:template name="item">
<xsl:param name="playlist_title" />
playlist_title=<xsl:value-of select="string($playlist_title)"/>
video_title=...
...
</xsl:template>
This did not return the value "blah" but just a blank. (my hope was to then replace "blah" with the playlist_title value pulled from the XML, but none of it is getting through)
I'm stumped! Thanks for any help.
In your first example, attempting to reference the $playlist_title inside of your template for item is failing because the variable does not exist inside of that template. It was created inside of the template match for rss and is locally scoped.
If you want to be able to use the $playlist_title in other templates, you need to declare it outside of the rss template in the "global scope".
For instance:
<xsl:variable name="playlist_title">
<xsl:value-of select="string(/rss/channel/title)"/>
</xsl:variable>
<xsl:template name="rss" match="/">
<xsl:for-each select="rss">
<xsl:apply-templates name="item" select="channel/items/item"/>
</xsl:for-each>
</xsl:template>
As for your second attempt, referencing the xsl:param playlist_title, it worked for me. Double-check that you don't have a typo or other difference between the name of the parameter that was declared and the name of the parameter that you are referencing.
You might try something like this to achieve your desired output, using an xsl:variable:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes"/>
<xsl:variable name="playlist_title" select="/rss/channel/title" />
<xsl:template match="/">
<xsl:apply-templates select="rss/channel"/>
</xsl:template>
<xsl:template match="channel">
<xsl:text>playlist_header_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="items/item"/>
</xsl:template>
<xsl:template match="item">
<xsl:text>playlist_title=</xsl:text>
<xsl:value-of select="$playlist_title"/>
<xsl:text>
</xsl:text>
<xsl:text>video_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
Or you can eliminate the xsl:variable and use an xsl:param:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates select="rss/channel"/>
</xsl:template>
<xsl:template match="channel">
<xsl:text>playlist_header_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="items/item">
<xsl:with-param name="playlist_title" select="title"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="item">
<xsl:param name="playlist_title" />
<xsl:text>playlist_title=</xsl:text>
<xsl:value-of select="$playlist_title"/>
<xsl:text>
</xsl:text>
<xsl:text>video_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
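A third possibility, shown here only as a sketch, is to skip the variable and the parameter altogether and let each item look up the header value itself by walking up the tree. It assumes every item sits below the channel element whose title it should repeat, as in the sample data.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="rss/channel/items/item"/>
  </xsl:template>

  <xsl:template match="item">
    <xsl:text>playlist_title=</xsl:text>
    <!-- Walk up from the current item to its channel and read the header title -->
    <xsl:value-of select="ancestor::channel/title"/>
    <xsl:text>&#10;video_title=</xsl:text>
    <!-- A relative path: the title child of the current item -->
    <xsl:value-of select="title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the item template independent of how it is invoked, which can help when the template is also reached through call-template or from other places in an inherited stylesheet.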

Adding number in a CSV in XSLT

How can you add up the numbers in a comma-separated list in XSLT 1.0?
I want to take:
<num>1,2,3</num>
and get the sum of the numbers in the element, so we would get 6 from the above.
Using FXSL and the str-split-to-words template (I am too lazy to write recursive templates, which is time-consuming and error-prone :) ):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common">
<xsl:import href="strSplit-to-Words.xsl"/>
<xsl:output indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="/">
<xsl:variable name="vwordNodes">
<xsl:call-template name="str-split-to-words">
<xsl:with-param name="pStr" select="."/>
<xsl:with-param name="pDelimiters" select="','"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vNums" select="ext:node-set($vwordNodes)/*"/>
<xsl:value-of select="sum($vNums)"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<num>1,2,3</num>
the wanted, correct result is produced:
6
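If importing FXSL or relying on the node-set() extension is not an option, a hand-written recursive template is also workable for a flat list like this one. The sketch below is an alternative to the FXSL approach, not part of it; the template name `sum-csv` and its parameters are made up for the example, and it assumes every token between commas is a valid number (an empty or non-numeric token would turn the result into NaN).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/num">
    <xsl:call-template name="sum-csv">
      <xsl:with-param name="list" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: consumes one token per call and carries a running total -->
  <xsl:template name="sum-csv">
    <xsl:param name="list"/>
    <xsl:param name="total" select="0"/>
    <xsl:choose>
      <xsl:when test="contains($list, ',')">
        <xsl:call-template name="sum-csv">
          <xsl:with-param name="list" select="substring-after($list, ',')"/>
          <xsl:with-param name="total"
              select="$total + number(substring-before($list, ','))"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <!-- Last (or only) token: add it and emit the result -->
        <xsl:value-of select="$total + number($list)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Applied to `<num>1,2,3</num>` this also yields 6.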