Export html to csv using xslt - Handle comma and special characters - xslt-1.0

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*" />
<xsl:variable name="delimiter" select="','" />
<xsl:template match="/"><xsl:apply-templates/></xsl:template>
<xsl:template match="TR"><xsl:apply-templates/><xsl:text>
</xsl:text> </xsl:template>
<xsl:template match="TD">
<xsl:for-each select="#*[not(name() = 'class' or name() = 'style')]">
<xsl:attribute name="{name()}"><xsl:value-of select="."/> </xsl:attribute>
</xsl:for-each>
<xsl:apply-templates/>
<xsl:if test="position() != last()">,</xsl:if>
</xsl:template>
</xsl:stylesheet>
I am using above XSL to convert HTML to CSV but if TD element contains comma it is being treated as separate column,I need to check if TD element data contains comma(,) if yes include data in double quotes.How do I achieve this?

Related

XSLT 1.0: How to determine the position of a node based on the value

thanks for taking the time!
I am trying to get the position of a node based on its value, so that I can then get corresponding entries. Please let me show you with an example, it will be easier to explain.
Here is the source XML:
<?xml version="1.0" encoding="UTF-8"?>
<sampleSourceXML>
<Table>
<Header>
<Column>Name</Column>
<Column>Surname</Column>
<Column>Title</Column>
</Header>
<Body>
<Row>
<Entry>James</Entry>
<Entry>Bond</Entry>
<Entry>Mr</Entry>
</Row>
<Row>
<Entry>Harry</Entry>
<Entry>Potter</Entry>
<Entry>Mr</Entry>
</Row>
</Body>
</Table>
</sampleSourceXML>
This is the resulting XML that I would like to achieve:
<?xml version="1.0" encoding="UTF-8"?>
<sampleResultXML>
<Person>
<FirstName>James</FirstName>
<LastName>Bond</LastName>
<Title>Mr</Title>
</Person>
<Person>
<FirstName>Harry</FirstName>
<LastName>Potter</LastName>
<Title>Mr</Title>
</Person>
</sampleResultXML>
I tried it with this XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:element name="sampleResultXML">
<xsl:for-each select="sampleSourceXML/Table/Rows/Row">
<xsl:element name="Person">
<xsl:element name="FirstName">
<xsl:value-of select="./Entry[1]/text()"></xsl:value-of>
</xsl:element>
<xsl:element name="LastName">
<xsl:value-of select="./Entry[2]/text()"></xsl:value-of>
</xsl:element>
<xsl:element name="Title">
<xsl:value-of select="./Entry[3]/text()"></xsl:value-of>
</xsl:element>
</xsl:element>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
This works.
But! The source XML can have a different order. For example the Title element can be above the Name element. So the hardcoded position in my XSLT (e.g. Entry[3]) will not work any more.
The header order and the entry order are always corresponding. So if the Column Title is in position 3, so will be the corresponding row.
So I need a way to get the position of, let's say, the Title element dynamically. So I have to go through the Column elements until I reach the one with value "Title". This position I thought I'd save in a variable (e.g. Position_Title) so that I can fill the value later like so:
<xsl:element name="Title">
<xsl:value-of select="./Entry[$Position_Title]/text()"></xsl:value-of>
</xsl:element>
However I haven't been able to find a way to do this...
Do you have any ideas how I can achieve this?
Thanks!
Nick
EDIT: Thanks to Michael for the help! Here is the final XSLT I'm using:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="/sampleSourceXML">
<xsl:variable name="cols" select="Table/Header/Column"/>
<sampleResultXML>
<xsl:for-each select="Table/Body/Row">
<Person>
<xsl:for-each select="Entry">
<xsl:variable name="i" select="position()"/>
<xsl:variable name="elementName">
<xsl:choose>
<xsl:when test="$cols[$i]='Name'">
<xsl:value-of select="'FirstName'"/>
</xsl:when>
<xsl:when test="$cols[$i]='Surname'">
<xsl:value-of select="'LastName'"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$cols[$i]"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
<xsl:element name="{$elementName}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
</Person>
</xsl:for-each>
</sampleResultXML>
</xsl:template>
</xsl:stylesheet>
So I need a way to get the position of, let's say, the Title element dynamically. So I have to go through the Column elements until I reach the one with value "Title". This position I thought I'd save in a variable (e.g. Position_Title) so that I can fill the value later like so:
I think it could be much simpler if you do it from the opposite direction. Try:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="utf-8" indent="yes"/>
<xsl:template match="/sampleSourceXML">
<xsl:variable name="cols" select="Table/Header/Column" />
<sampleResultXML>
<xsl:for-each select="Table/Body/Row">
<Person>
<xsl:for-each select="Entry">
<xsl:variable name="i" select="position()" />
<xsl:element name="{$cols[$i]}">
<xsl:value-of select="."/>
</xsl:element>
</xsl:for-each>
</Person>
</xsl:for-each>
</sampleResultXML>
</xsl:template>
</xsl:stylesheet>

Apply templates that matchs two conditions

I need to list all the INST names but only if the "onlyTesters" node don´t exists in the "inst/idef" part of XML body above.
I know thats strange but I can´t change the XML I receive.
XML:
<river>
<station num="699">
<inst name="FLU(m)" num="1">
<idef></idef>
</inst>
<inst name="Battery(V)" num="18">
<idef>
<onlyTesters/>
</idef>
</inst>
</station>
<INST name="PLU(mm)" num="0" hasData="1" virtual="0"/>
<INST name="FLU(m)" num="1" hasData="1" virtual="0"/>
<INST name="Q(m3/s)" num="3" hasData="1" virtual="1"/>
<INST name="Battery(V)" num="18" hasData="1" virtual="0"/>
</river>
XSL:
<xsl:template match="/">
<xsl:apply-templates select="//INST[#hasData = 1 and not(//inst[#num=(current()/#num)]/idef/onlyTesters)]/#name"/>
</xsl:template>
<xsl:template match="//INST[#hasData = 1 and not(//inst[#num=(current()/#num)]/idef/onlyTesters)]/#name">
<xsl:value-of select="#name"/>,
</xsl:template>
I´m having no match.
This is the result I expect:
PLU(mm),FLU(m),Q(m3/s)
You can achieve this with only one template:
<xsl:template match="/">
<xsl:for-each select="//INST[#hasData='1' and not(#name=//inst[idef/onlyTesters]/#name)]">
<xsl:value-of select="#name"/>
<xsl:if test="position() != last()">, </xsl:if>
</xsl:for-each>
</xsl:template>
Output is:
PLU(mm), FLU(m), Q(m3/s)
Cross-references are best resolved using a key - for example:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8" />
<xsl:key name="inst" match="inst" use="#name" />
<xsl:template match="/river">
<xsl:for-each select="INST[#hasData = 1 and not(key('inst', #name)/idef/onlyTesters)]">
<xsl:value-of select="#name"/>
<xsl:if test="position() != last()">,</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Or even simpler:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8" />
<xsl:key name="exclude" match="onlyTesters" use="ancestor::inst/#name" />
<xsl:template match="/river">
<xsl:for-each select="INST[#hasData = 1 and not(key('exclude', #name))]">
<xsl:value-of select="#name"/>
<xsl:if test="position() != last()">, </xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>

xslt - create empty file using xslt 1.0

I am trying to create an empty file through xslt.
The input sample is:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Businessman>
<siblings>
<sibling>John </sibling>
</siblings>
<child> Pete </child>
<child> Ken </child>
</Businessman>
When the input contains any presence of 'child' tags, it should produce the file AS IS. When the input does not have any 'child' tag, I need an empty file (0 byte file) created.
This is what I tried:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />
<xsl:template match="#*|node()">
<xsl:choose>
<xsl:when test="/Businessman/child">
<xsl:copy>
<xsl:apply-templates select="#*|node()" />
</xsl:copy>
</xsl:when>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
This gives the file unchanged when there is any 'child' tag present. But did not produce any empty file when there is no 'child' tag.
The file I need to test will look like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Businessman>
<siblings>
<sibling>John </sibling>
</siblings>
</Businessman>
Any help would be great!
Thanks
If you want the processor to go to the trouble of opening the output file, you have to give it something to write to the output file. Try an empty text node. And you only need to make the decision 'copy or not?' once.
One way to make the decision just once and produce empty output if the condition is not met would be to replace your template with:
<xsl:template match="/">
<xsl:choose>
<xsl:when test="/Businessman/child">
<xsl:copy-of select="*"/>
</xsl:when>
<xsl:otherwise>
<xsl:text/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
This works as expected with xsltproc. (If you find yourself getting a file containing an XML declaration and nothing else, try adjusting the parameters on xsl:output.)
But when I have found myself with a similar situation (perform this transform if condition C holds, otherwise ...), I have simply added a template for the document node that would look something like this for your case:
<xsl:choose>
<xsl:when test="/Businessman/child">
<xsl:apply-templates/>
</
<xsl:otherwise>
<xsl:message terminate="yes">No children in this input, dying ...</
</
</
That way I get no output at all rather than zero-length output.
Simple enough - Just don't try to do everything in one template, don't forget to omit the xml declaration and get the xpath right:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes" />
<xsl:template match="/">
<xsl:apply-templates />
</xsl:template>
<xsl:template match="Businessman[child]" priority="9">
<xsl:element name="Businessman">
<xsl:apply-templates />
</xsl:element>
</xsl:template>
<xsl:template match="Businessman" priority="0" />
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

How to Parse Query parameters in XSLT

I have a requirement I will get the URL in the format like this below
https://sample.com?first=one&second=two&third=three
How can I form an xml structure below which will be formed by making use of this query parameters
<first>one</first><second>two</second><third>three</third>
can Please somebody help me with thisand provide me an xslt for this requirement
Given the following XML input (note the escaping of the ampersand character):
<URL>https://sample.com?first=one&second=two&third=three</URL>
the folowing stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<output>
<xsl:call-template name="tokenize">
<xsl:with-param name="text" select="substring-after(URL, '?')"/>
</xsl:call-template>
</output>
</xsl:template>
<xsl:template name="tokenize">
<xsl:param name="text"/>
<xsl:param name="delimiter" select="'&'"/>
<xsl:variable name="token" select="substring-before(concat($text, $delimiter), $delimiter)" />
<xsl:if test="$token">
<xsl:element name="{substring-before($token, '=')}">
<xsl:value-of select="substring-after($token, '=')"/>
</xsl:element>
</xsl:if>
<xsl:if test="contains($text, $delimiter)">
<!-- recursive call -->
<xsl:call-template name="tokenize">
<xsl:with-param name="text" select="substring-after($text, $delimiter)"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
will return:
<?xml version="1.0" encoding="UTF-8"?>
<output>
<first>one</first>
<second>two</second>
<third>three</third>
</output>
Note that this will work only if the query field names are also valid XML element names.

Remove all single characters from string using xslt1.0

<title>
<article_title>Land a b c d Band</article_title>
</title>
using the following function
replace(article_title, '(^[^ ]+)(.+\s+)([^ ]+)$', '$1 $3')
this string in is transformed to Land Band which is exactly what i want.
but the problem is i need this solution in xslt 1.0 since the java app that i am working with can only handle xslt 1.0 parsing.
This XSLT 1.0 transformation (there is a nasty SO bug and the code isn't indented -- I apologize for this visual mess...):
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|#*">
<xsl:copy>
<xsl:apply-templates select="node()|#*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()" name="removeSingles">
<xsl:param name="pText" select="."/>
<xsl:variable name="vText" select="normalize-space($pText)"/>
<xsl:if test="string-length($vText)">
<xsl:variable name="vLeftChars" select=
"substring-before(concat($vText, ' '), ' ')"/>
<xsl:if test="string-length($vLeftChars) >1">
<xsl:value-of select="$vLeftChars"/>
<xsl:if test=
"not(string-length($vLeftChars)
>=
string-length($vText)
)
">
<xsl:text> </xsl:text>
</xsl:if>
</xsl:if>
<xsl:call-template name="removeSingles">
<xsl:with-param name="pText" select=
"substring-after($vText, ' ')"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<title>
<article_title>Land a b c d Band</article_title>
</title>
produces the wanted, correct result:
<title>
<article_title>Land Band</article_title>
</title>