How do I use XSLT to merge XML elements and then structure elements/attributes of the same name/type?

Main issue
I have an XML file which has multiple similar elements with slightly different information. So, let's say I have two elements with the same X attribute, but different Y attributes. Here is an example:
<FILE>
<ELEMENT>
<ATTRIBUTEX>1</ATTRIBUTEX>
<ATTRIBUTEY>A</ATTRIBUTEY>
</ELEMENT>
<ELEMENT>
<ATTRIBUTEX>1</ATTRIBUTEX>
<ATTRIBUTEY>B</ATTRIBUTEY>
</ELEMENT>
</FILE>
What I want to do is merge them, based on the fact that both elements have the same X attribute, into a single element with one X attribute and multiple child elements of the same type, each containing one of the different Y attributes. So, for example, I want my file to end up like this:
<FILE>
<ELEMENT>
<ATTRIBUTEX>1</ATTRIBUTEX>
<NEWELEMENT>
<ATTRIBUTEY>A</ATTRIBUTEY>
</NEWELEMENT>
<NEWELEMENT>
<ATTRIBUTEY>B</ATTRIBUTEY>
</NEWELEMENT>
</ELEMENT>
</FILE>
Possible solution?
One possible solution I can think of, but lack the knowledge to execute, is made up of two steps:
Firstly, I could merge all elements which contain the same X attribute so that the information is, at the least, in one place. I don't know quite how to do this.
Secondly, once this is done, I could in theory use my knowledge of XSLT to restructure the element.
Problem with the possible solution
However, once all elements that contain the X attribute are merged together (first step), I will have a file in which there are attributes with the same name (specifically, the Y attribute). Here is an example:
<FILE>
<ELEMENT>
<ATTRIBUTEX>1</ATTRIBUTEX>
<ATTRIBUTEY>A</ATTRIBUTEY>
<ATTRIBUTEY>B</ATTRIBUTEY>
</ELEMENT>
</FILE>
What this means is that--at least with my knowledge--when I execute an XSLT file on the above XML (second step), it cannot distinguish between the two Y elements when sorting them into the two new child elements (NEWELEMENT).
So, let's say I execute the following XSLT on the above XML:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="FILE">
<Record>
<xsl:for-each select = "ELEMENT">
<ELEMENT>
<xsl:copy-of select="ATTRIBUTEX"/>
<NEWELEMENT>
<xsl:copy-of select="ATTRIBUTEY"/>
</NEWELEMENT>
<NEWELEMENT>
<xsl:copy-of select="ATTRIBUTEY"/>
</NEWELEMENT>
</ELEMENT>
</xsl:for-each>
</Record>
</xsl:template>
</xsl:stylesheet>
The output is this:
<Record>
<ELEMENT>
<ATTRIBUTEX>1</ATTRIBUTEX>
<NEWELEMENT>
<ATTRIBUTEY>A</ATTRIBUTEY>
<ATTRIBUTEY>B</ATTRIBUTEY>
</NEWELEMENT>
<NEWELEMENT>
<ATTRIBUTEY>A</ATTRIBUTEY>
<ATTRIBUTEY>B</ATTRIBUTEY>
</NEWELEMENT>
</ELEMENT>
</Record>
As you can see, the XSLT has taken all and any ATTRIBUTEY elements and put them into each NEWELEMENT rather than distinguishing between them and placing one ATTRIBUTEY in the first NEWELEMENT, and the second ATTRIBUTEY in the second NEWELEMENT. I am in need of an XSLT file which does just that, and, as stated further up, produces an XML that looks like this:
<FILE>
<ELEMENT>
<ATTRIBUTEX>1</ATTRIBUTEX>
<NEWELEMENT>
<ATTRIBUTEY>A</ATTRIBUTEY>
</NEWELEMENT>
<NEWELEMENT>
<ATTRIBUTEY>B</ATTRIBUTEY>
</NEWELEMENT>
</ELEMENT>
</FILE>
Would anyone be able to help me with this? Any solutions which generate the desired output (above) from the very first example file would be much appreciated! It's a bonus if that solution follows steps one and two. Thanks!

I'm getting your requested output from the XSLT 3.0 below.
Just a couple of notes beforehand:
You mention ATTRIBUTEs but, in fact, everything is an element in your XML document. There are no attributes in the XML sense.
I really don't see the need to create the additional subelements: you can already have a list of elements of the same name.
Nonetheless, as an exercise in style(sheets), here is a stylesheet that produces your output from your given input:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all">
<xsl:mode on-no-match="shallow-copy"/>
<xsl:output method="xml" indent="yes" />
<xsl:template match="/FILE" >
<FILE>
<xsl:for-each-group select="*" group-by="name(.)" >
<xsl:variable name="parentName" as="xs:string" select="current-grouping-key()" />
<xsl:variable name="childNamesSameValues" as="xs:string*" >
<xsl:for-each-group select="current-group()/*" group-by="name(.)" >
<xsl:if test="count(distinct-values(current-group()/text())) eq 1">
<xsl:sequence select="current-grouping-key()" />
</xsl:if>
</xsl:for-each-group>
</xsl:variable>
<xsl:element name="{$parentName}" >
<xsl:for-each-group select="current-group()/*" group-by="name(.)" >
<xsl:choose>
<xsl:when test="current-grouping-key() = $childNamesSameValues">
<xsl:copy-of select="current-group()[1]" />
</xsl:when>
<xsl:otherwise >
<xsl:for-each select="current-group()" >
<xsl:element name="NEW{$parentName}" >
<xsl:copy-of select="." />
</xsl:element>
</xsl:for-each>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</xsl:element>
</xsl:for-each-group>
</FILE>
</xsl:template>
</xsl:stylesheet>
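If only an XSLT 1.0 processor is available, the same result can be sketched with Muenchian grouping instead of xsl:for-each-group. This is a swapped-in technique, not the stylesheet above: it groups ELEMENT nodes by the value of ATTRIBUTEX (rather than by element name) and assumes every ELEMENT has exactly one ATTRIBUTEX child. The key name by-x is made up for this example.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Index every ELEMENT by the string value of its ATTRIBUTEX child
       (the key name "by-x" is hypothetical). -->
  <xsl:key name="by-x" match="ELEMENT" use="ATTRIBUTEX"/>

  <xsl:template match="/FILE">
    <FILE>
      <!-- Visit only the first ELEMENT of each ATTRIBUTEX group. -->
      <xsl:for-each select="ELEMENT[generate-id() = generate-id(key('by-x', ATTRIBUTEX)[1])]">
        <ELEMENT>
          <xsl:copy-of select="ATTRIBUTEX"/>
          <!-- Wrap each ATTRIBUTEY of the whole group in its own NEWELEMENT. -->
          <xsl:for-each select="key('by-x', ATTRIBUTEX)/ATTRIBUTEY">
            <NEWELEMENT>
              <xsl:copy-of select="."/>
            </NEWELEMENT>
          </xsl:for-each>
        </ELEMENT>
      </xsl:for-each>
    </FILE>
  </xsl:template>
</xsl:stylesheet>
```

The inner xsl:for-each is what the attempt in the question was missing: it creates one NEWELEMENT per ATTRIBUTEY instead of copying every ATTRIBUTEY into a fixed number of wrappers.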

Related

XSLT to CSV - How to control the order of the same element with different attribute values?

I am quite a newbie to XSLT and I need to figure out how to write repeating elements from an XML file to a CSV file in a specific order based on their attribute value. My result document is a CSV file whose content I ultimately need to import into SQL Server tables. Therefore, the order of the retrieved elements within the CSV file does matter, as they need to match the table columns defined as headers.
My problem occurs with the Project Content_Detail element that exists in different languages and can appear in any order in the XML source. I only need to extract the German version with the Title, Goal element first, followed by the English version elements.
I use SSIS with MS Visual Studio 2019 to transform my XML to a CSV file. MS Visual Studio supports only XSLT 1.0.
Here is my XML file (EDITED):
<?xml version="1.0" encoding="utf-8"?>
<dta:Projects xmlns:dta="http://domain.test/dta">
<dta:Project>
<dta:Core-Basis>
<dta:Goal>1672</dta:Goal>
</dta:Core-Basis>
<dta:Basisinfo>
<dta:Content>
<dta:Content_Detail Lang="de">
<dta:Title>Wirtschaft</dta:Title>
<dta:Aim>Steigerung</dta:Aim>
</dta:Content_Detail>
<dta:Content_Detail Lang="en">
<dta:Title>Economy</dta:Title>
</dta:Content_Detail>
</dta:Content>
</dta:Basisinfo>
</dta:Project>
<dta:Project >
<dta:Core-Basis>
<dta:Goal>2035</dta:Goal>
</dta:Core-Basis>
<dta:Basisinfo>
<dta:Content>
<dta:Content_Detail Lang="en">
<dta:Title>Environmental Protection</dta:Title>
<dta:Aim>Facilitation</dta:Aim>
</dta:Content_Detail>
<dta:Content_Detail Lang="de">
<dta:Title>Naturschutz</dta:Title>
</dta:Content_Detail>
</dta:Content>
</dta:Basisinfo>
</dta:Project >
</dta:Projects>
This is my XSLT file (EDITED). I tried to add numerous order and sort arguments to the xsl:if block but I couldn’t come up with a working solution yet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:dta="http://domain.test/dta" exclude-result-prefixes="dta">
<xsl:output method="text" encoding="UTF-8" indent="no"/>
<xsl:template match="/">
<xsl:text>DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;
</xsl:text>
<xsl:apply-templates mode="find_content"/>
</xsl:template>
<xsl:template match="text()|@*" mode="find_content"/>
<xsl:template match="dta:Project" mode="find_content">
<xsl:value-of select="concat(dta:Core-Basis/dta:Goal,';')"/>
<xsl:apply-templates mode="find_content"/>
</xsl:template>
<xsl:template match="dta:Content_Detail" mode="find_content">
<xsl:if test="@Lang='de'">
<xsl:value-of select="concat(dta:Title,';')"/>
<xsl:value-of select="concat(dta:Aim,';')"/>
</xsl:if>
<xsl:if test="@Lang='en'">
<xsl:value-of select="concat(dta:Title,';')"/>
<xsl:value-of select="concat(dta:Aim,'
')"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
This is my CSV output so far, not sorting the Content_detail based on their language value attribute:
DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;
1672;Wirtschaft;Steigerung;Economy;
2035;Environmental Protection;Facilitation
Naturschutz;;
This is what I need:
DTA_Goal; DTA_Title_de; DTA_Aim_de; DTA_Title_en; DTA_Aim_en;
1672;Wirtschaft;Steigerung;Economy;;
2035;Naturschutz;;Environmental Protection;Facilitation;
As I am not a professional programmer, any help or comment also on the rest of my code is highly appreciated.
In your input, one of the dta:Project elements has a dta:Core-Basis child, while the other has dta:Core-Basisdaten. Assuming that's a mistake*, you could produce the needed output simply by:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:dta="http://domain.test/dta">
<xsl:output method="text" encoding="UTF-8" />
<xsl:template match="/dta:Projects">
<!-- header -->
<xsl:text>DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;
</xsl:text>
<!-- data -->
<xsl:for-each select="dta:Project">
<xsl:value-of select="dta:Core-Basis/dta:Goal"/>
<xsl:text>;</xsl:text>
<!-- de -->
<xsl:variable name="content-de" select="dta:Basisinfo/dta:Content/dta:Content_Detail[@Lang='de']" />
<xsl:value-of select="$content-de/dta:Title"/>
<xsl:text>;</xsl:text>
<xsl:value-of select="$content-de/dta:Aim"/>
<xsl:text>;</xsl:text>
<!-- en -->
<xsl:variable name="content-en" select="dta:Basisinfo/dta:Content/dta:Content_Detail[@Lang='en']" />
<xsl:value-of select="$content-en/dta:Title"/>
<xsl:text>;</xsl:text>
<xsl:value-of select="$content-en/dta:Aim"/>
<xsl:text>;</xsl:text>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
(*) If not, change:
<xsl:value-of select="dta:Core-Basis/dta:Goal"/>
to:
<xsl:value-of select="(dta:Core-Basis|dta:Core-Basisdaten)/dta:Goal"/>
P.S. Not sure why you need the trailing ; at the end of each line.
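As a variation on the stylesheet above, the per-language columns can be produced by one named template that is called once per language, so a further language only needs one more call. This is a sketch under the same assumption (every dta:Project has a dta:Core-Basis child); the template name lang-columns and its lang parameter are invented for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dta="http://domain.test/dta">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/dta:Projects">
    <!-- header -->
    <xsl:text>DTA_Goal;DTA_Title_de;DTA_Aim_de;DTA_Title_en;DTA_Aim_en;&#10;</xsl:text>
    <!-- one CSV row per project, columns in a fixed language order -->
    <xsl:for-each select="dta:Project">
      <xsl:value-of select="concat(dta:Core-Basis/dta:Goal, ';')"/>
      <xsl:call-template name="lang-columns">
        <xsl:with-param name="lang" select="'de'"/>
      </xsl:call-template>
      <xsl:call-template name="lang-columns">
        <xsl:with-param name="lang" select="'en'"/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: writes "Title;Aim;" for one language.
       xsl:call-template keeps dta:Project as the current node. -->
  <xsl:template name="lang-columns">
    <xsl:param name="lang"/>
    <xsl:variable name="detail"
        select="dta:Basisinfo/dta:Content/dta:Content_Detail[@Lang = $lang]"/>
    <xsl:value-of select="concat($detail/dta:Title, ';', $detail/dta:Aim, ';')"/>
  </xsl:template>
</xsl:stylesheet>
```

A missing dta:Title or dta:Aim contributes an empty string to concat(), so the column count stays constant regardless of the order in which the Content_Detail elements appear in the source.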

BizTalk Map XSLT to escape characters within node

I have a map with a standard XML file as the source and a destination file which needs to be a flat file.
I've found an article to assist with the conversion to string, the issue I am having now is that the output is not valid against the destination schema and I believe I need to escape the characters to get around this. I'm looking for some help with the XSLT for that.
Source file
<request>
<token>aldkfj</token>
<sms>
<name>name</name>
<contact>
<number>0123465254</number>
<message>hello</message>
</contact>
</sms>
</request>
Destination (required)
<SMS>
<HttpBody>
<Key>xml</Key>
<Value>&lt;request&gt;&lt;token&gt;95wF-8BpA-Lnyh&lt;/token&gt;&lt;sms&gt;&lt;name&gt;test&lt;/name&gt;&lt;contact&gt;&lt;number&gt;447968296947&lt;/number&gt;&lt;/contact&gt;&lt;message&gt;API Message&lt;/message&gt;&lt;/sms&gt;&lt;/request&gt;</Value>
</HttpBody>
</ns0:SMS>
Template XSLT proposed:
<xsl:template name="xml-to-string-called-template">
<xsl:param name="param1" />
<xsl:element name="Value">
<xsl:call-template name="identity" />
</xsl:element>
</xsl:template>
<xsl:template name="identity" match="@*|node()">
<xsl:copy>
<xsl:text disable-output-escaping="no">
<xsl:apply-templates select="@*|node()" />
</xsl:text>
</xsl:copy>
</xsl:template>
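One way to get escaped markup inside Value in XSLT 1.0 is to write the tags out as text, so the serializer escapes the angle brackets itself. The following is a minimal standalone sketch, not a BizTalk-specific solution: the mode name to-string is made up, namespace declarations, comments and processing instructions are not reproduced, and special characters inside text or attribute values are not escaped a second time.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <Value>
      <xsl:apply-templates select="*" mode="to-string"/>
    </Value>
  </xsl:template>

  <!-- Write each element as text; the serializer turns < and > into
       &lt; and &gt; in the result document. -->
  <xsl:template match="*" mode="to-string">
    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:for-each select="@*">
      <xsl:value-of select="concat(' ', name(), '=&quot;', ., '&quot;')"/>
    </xsl:for-each>
    <xsl:text>&gt;</xsl:text>
    <xsl:apply-templates select="node()" mode="to-string"/>
    <xsl:text>&lt;/</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:text>&gt;</xsl:text>
  </xsl:template>

  <!-- Text content is passed through unchanged. -->
  <xsl:template match="text()" mode="to-string">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Inside a map, the two to-string templates would presumably be the part to reuse, with the root template replaced by whatever builds the surrounding HttpBody structure.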

Looping a variable length array with namespaces in XSLT

My previous question[1] is related to this. I found the answer for that. Now I want to loop a variable length array with namespaces. My array:
<ns:array xmlns:ns="http://www.example.org">
<value>755</value>
<value>5861</value>
<value>4328</value>
<value>2157</value>
<value>1666</value>
</ns:array>
My XSLT code (I have added the namespace in the root):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ns1="http://www.example.org">
<xsl:template match="/">
<xsl:variable name="number" select="ns:array" />
<xsl:for-each select="$number">
<xsl:value-of select="$number" />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
[1]https://stackoverflow.com/questions/20287219/looping-a-variable-length-array-in-xslt
IMHO you confused yourself by introducing a variable called number which actually contains a node set of value tags. Then, as a consequence, you used your variable as a single item/node, which does not yield the desired result (presumably, since you did not really tell us what you want to do with the values).
Also, I think your question does not really have anything to do with namespace issues as such. You just have to make sure that the namespaces in your select expressions match the namespaces in your input file.
I would suggest to do without the variable and change the way you retrieve the current value:
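<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ns1="http://www.example.org">
<xsl:template match="/">
<!-- The prefix must be one declared in this stylesheet (ns1), and the
path must reach the value elements so that each one becomes the current node. -->
<xsl:for-each select="ns1:array/value">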
<!-- Inside here you can work with the `value` tag as the _current node_.
There are two most likely ways to do this. -->
<!-- a) Copy the whole tag to the output: -->
<xsl:copy-of select="." />
<!-- or b1) Copy the text part contained in the tag to the output: -->
<xsl:value-of select="." />
<!-- If you want to be on the safe side with respect to white space
you can also use this b2). This would handle the case that your output
is required not to have any white space in it but your input XML has
some. -->
<xsl:value-of select="normalize-space(.)" />
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
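To illustrate the point about matching namespaces, here is a small self-contained sketch that lists the values as text. The prefix a is chosen arbitrarily: the stylesheet prefix does not have to be the one used in the input, only the namespace URI has to match. The value elements carry no prefix in the input and are in no namespace, so they are selected without one.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:a="http://www.example.org">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- a:array matches ns:array in the input because both prefixes are
         bound to http://www.example.org. -->
    <xsl:for-each select="a:array/value">
      <!-- position() is the 1-based index within the loop. -->
      <xsl:value-of select="concat(position(), ': ', normalize-space(.), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because xsl:for-each iterates over however many value elements the path selects, the length of the array does not need to be known in advance.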

how to display content that appears once outside the repeating nodes in the expression of every node in xslt

I've got data in XML that has some header information then a series of items. I'm using XSLT to translate that into a different format, also with a header area and a series of items.
However in the post translation result, I want one piece of data only found in the header to be included in every instance of the items, even though it will simply be repeating the same value. (this value may change so I cannot hard code it)
Sample data (significantly simplified)
<rss>
<channel>
<title>Playlist One</title>
<items>
<item>
<title>Video One</title>
</item>
<item>
<title>Video Two</title>
</item>
<item>
<title>Video Three</title>
</item>
</items>
</channel>
</rss>
My desired result is something like this:
playlist_header_title=Playlist One
playlist_title=Playlist One
video_title=Video One
playlist_title=Playlist One
video_title=Video Two
playlist_title=Playlist One
video_title=Video Three
My XSLT is very complicated (and unfortunately I inherited it from someone else so I'm not sure what everything does, I've self-taught myself online but am in a bit over my head)
Roughly the key pieces look like this:
<xsl:template name="rss" match="/">
<xsl:variable name="playlist_title">
<xsl:value-of select="string(/rss/channel/title)"/>
</xsl:variable>
<xsl:for-each select="rss">
<xsl:apply-templates name="item" select="channel/items/item"/>
</xsl:for-each>
</xsl:template>
Then there's a massive template called "item" that I won't include here, but basically it outputs all the item data as desired, I just can't figure out how to access the "playlist_title".
When I try to call (from inside the template "item")
<xsl:value-of select="string($playlist_title)"/>
it returns a blank. I assume this is because that variable was created outside the for-each loop and so it is not available. (it will display the data correctly when I output it before the for-each loop in the result's version of the header, but that doesn't get me far enough)
I've tried using with-param in the apply-templates, and also tried changing it to call-template inside another loop also using with-param, but they also display a blank.
I've also tried sending in a string rather than pulling the playlist_title from the XML just to confirm I am able to pass any value into the template, but they too come out blank.
For instance:
<xsl:for-each select="channel/items/item">
<xsl:call-template name="item">
<xsl:with-param name="playlist_title">blah</xsl:with-param>
</xsl:call-template>
</xsl:for-each>
<xsl:template name="item">
<xsl:param name="playlist_title" />
playlist_title=<xsl:value-of select="string($playlist_title)"/>
video_title=...
...
</xsl:template>
This did not return the value "blah" but just a blank. (my hope was to then replace "blah" with the playlist_title value pulled from the XML, but none of it is getting through)
I'm stumped! Thanks for any help.
In your first example, attempting to reference the $playlist_title inside of your template for item is failing because the variable does not exist inside of that template. It was created inside of the template match for rss and is locally scoped.
If you want to be able to use the $playlist_title in other templates, you need to declare it outside of the rss template in the "global scope".
For instance:
<xsl:variable name="playlist_title">
<xsl:value-of select="string(/rss/channel/title)"/>
</xsl:variable>
<xsl:template name="rss" match="/">
<xsl:for-each select="rss">
<xsl:apply-templates name="item" select="channel/items/item"/>
</xsl:for-each>
</xsl:template>
As for your second attempt, referencing the xsl:param playlist_title, it worked for me. Double-check to ensure that you don't have a typo or other difference between the name of the parameter that was declared and the name of the parameter that you are referencing.
You might try something like this to achieve your desired output, using an xsl:variable:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes"/>
<xsl:variable name="playlist_title" select="/rss/channel/title" />
<xsl:template match="/">
<xsl:apply-templates select="rss/channel"/>
</xsl:template>
<xsl:template match="channel">
<xsl:text>playlist_header_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="items/item"/>
</xsl:template>
<xsl:template match="item">
<xsl:text>playlist_title=</xsl:text>
<xsl:value-of select="$playlist_title"/>
<xsl:text>
</xsl:text>
<xsl:text>video_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
Or you can eliminate the xsl:variable and use an xsl:param:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates select="rss/channel"/>
</xsl:template>
<xsl:template match="channel">
<xsl:text>playlist_header_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
<xsl:apply-templates select="items/item">
<xsl:with-param name="playlist_title" select="title"/>
</xsl:apply-templates>
</xsl:template>
<xsl:template match="item">
<xsl:param name="playlist_title" />
<xsl:text>playlist_title=</xsl:text>
<xsl:value-of select="$playlist_title"/>
<xsl:text>
</xsl:text>
<xsl:text>video_title=</xsl:text>
<xsl:value-of select="title"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>
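A third option, sketched here as an alternative to both stylesheets above, needs neither a variable nor a parameter: the item template can navigate back up to its channel with the ancestor axis. It assumes, as in the sample data, that every item sits somewhere below the channel whose title it should repeat.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="rss/channel"/>
  </xsl:template>

  <xsl:template match="channel">
    <xsl:value-of select="concat('playlist_header_title=', title, '&#10;')"/>
    <xsl:apply-templates select="items/item"/>
  </xsl:template>

  <xsl:template match="item">
    <!-- ancestor::channel/title looks the header value up from the item itself. -->
    <xsl:value-of select="concat('playlist_title=', ancestor::channel/title, '&#10;')"/>
    <xsl:value-of select="concat('video_title=', title, '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```

This keeps the item template independent of how it is invoked, which may be convenient when the inherited stylesheet applies it from more than one place.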

Recursive transformations using xslt, xpath:document() and mediawiki

I want to use the Wikipedia API to find the French pages that include ''Template:Infobox Scientifique'' but are missing from the English version. So, my idea was to process the following document with xproc:
http://fr.wikipedia.org/w/api.php?action=query&format=xml&list=embeddedin&eititle=Template:Infobox%20Scientifique&eilimit=400
and the following xslt stylesheet:
<?xml version='1.0' ?>
<xsl:stylesheet
xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
version='1.0'
>
<xsl:output method='text' indent="yes"/>
<xsl:template match="/">
<xsl:apply-templates select="api"/>
</xsl:template>
<xsl:template match="api">
<xsl:for-each select="query/embeddedin/ei">
<xsl:variable name="title" select="translate(@title,&apos; &apos;,&apos;_&apos;)"/>
<xsl:variable name="english-title">
<xsl:call-template name="englishTitle"><xsl:with-param name="title" select="@title"/></xsl:call-template>
</xsl:variable>
<xsl:value-of select="$english-title"/><xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
<xsl:template name="englishTitle">
<xsl:param name="title"/>
<xsl:variable name="uri1" select="concat(&apos;http://fr.wikipedia.org/w/api.php?action=query&amp;format=xml&amp;prop=langlinks&amp;lllimit=500&amp;titles=&apos;,translate($title,&apos; &apos;,&apos;_&apos;))"/>
<xsl:message><xsl:value-of select="$uri1"/></xsl:message>
<xsl:message>count=<xsl:value-of select="count(document($uri1,/api/query/pages/page/langlinks/ll))"/></xsl:message>
</xsl:template>
</xsl:stylesheet>
The XSLT extracts all the articles containing the template, and for each article I wanted to call Wikipedia to get the links between the wikis. Here the template englishTitle calls the XPath function document().
But it always says that count(ll)=1 whereas there are plenty of nodes (e.g. http://fr.wikipedia.org/w/api.php?action=query&format=xml&prop=langlinks&lllimit=500&titles=Carl_Sagan ).
Can't I process the nodes returned by the document() function?
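In your version the path sits inside the document() call, where it is treated as the optional second argument (a node set used only to resolve relative URIs), so you end up counting the single document node that document() returns. You should try: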
<xsl:value-of select="count(document($uri1)/api/query/pages/page/langlinks/ll)"/>
On a different note - what is
translate(@title,&apos; &apos;,&apos;_&apos;)
supposed to mean? What's wrong with:
translate(@title, ' ', '_')
There is no need to encode single quotes in XML attributes unless you want to use a type of quote that delimits the attribute value. All of these are valid:
name="foo&quot;'foo"
name='foo&apos;"foo'
Your entire transformation can be reduced to something like this:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
<xsl:output method="text" />
<xsl:param name="baseUrl" select="'http://fr.wikipedia.org/w/api.php?action=query&amp;format=xml&amp;prop=langlinks&amp;lllimit=500&amp;titles='" />
<xsl:template match="ei">
<xsl:variable name="uri" select="concat($baseUrl ,translate(@title,' ','_'))"/>
<xsl:variable name="doc" select="document($uri)"/>
<xsl:value-of select="$uri"/>
<xsl:text>
</xsl:text>
<xsl:text>count=</xsl:text>
<xsl:value-of select="count($doc/api/query/pages/page/langlinks/ll)"/>
<xsl:text>
</xsl:text>
</xsl:template>
<xsl:template match="text()" />
</xsl:stylesheet>
Let the XSLT default templates work for you - they do all of the recursion in the background, all you have to do is catch the nodes you want to process (and prevent output of unnecessary text by overriding the default text() template with an empty one).