XSLT 1.0 solution required. My question is similar to "XSLT Change element order", and I'll take that answer if I have to, but I hope I can do something like "put this element first, and retain the original order of all the rest". The input looks something like this, where ... can be any set of simple elements or text nodes, but no processing instructions or comments. See below also.
<someXML>
<recordList>
<record priref="1" created="2009-06-04T16:54:35" modification="2014-12-16T14:56:51" selected="False">
...
<collection_type>3D</collection_type>
...
<object_category>headgear</object_category>
<object_name>hat</object_name>
<object_number>060998</object_number>
...
</record>
<record priref="3" created="2009-06-04T11:54:35" modification="2020-08-05T18:24:33" selected="False">
...
<collection_type>3D</collection_type>
<description>a very elaborate coat</description>
<object_category>clothing</object_category>
<object_name>coat</object_name>
<object_number>060998</object_number>
</record>
</recordList>
</someXML>
This would be the desired output.
<someXML>
<recordList>
<record priref="1" created="2009-06-04T16:54:35" modification="2014-12-16T14:56:51" selected="False">
<object_category>headgear</object_category>
...
<collection_type>3D</collection_type>
...
<object_name>hat</object_name>
<object_number>060998</object_number>
...
</record>
<record priref="3" created="2009-06-04T11:54:35" modification="2020-08-05T18:24:33" selected="False">
<object_category>clothing</object_category>
...
<collection_type>3D</collection_type>
<description>a very elaborate coat</description>
<object_name>coat</object_name>
<object_number>060998</object_number>
</record>
</recordList>
</someXML>
It's probably OK if object_category is put first, and then occurs again later on in the record, i.e. in the tags in their original order.
I'll add some background. There's this API producing about 900,000 XML records, each with its own set of tags (element names) in alphabetical order. There are about 170 different element names (that's why I don't want to have to list them all individually, unless there's no other way). The XML is ingested into a graph database. That takes time, but it could be sped up if we see object_category as the first element in the record.
Edit: We can configure the API, but not the C# code behind the API. We step through the database, step by step ingesting chunks of ~100 records. If we specify nothing else, we get the XML as exemplified above. We can also specify an XSL sheet to transform the XML. That's what we want to do here.
The example is ambiguous, because we don't know what all those ... placeholders stand for. I suppose this should work for you:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="record">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="object_category"/>
<xsl:apply-templates select="node()[not(self::object_category)]"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
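If a repeated object_category later in the record really is acceptable, as the question suggests it might be, a shorter variant is possible. This is only a sketch under that assumption: it copies the attributes, then object_category, then every child in its original order, so object_category shows up twice in each record.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity transform for everything outside record -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="record">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- object_category first ... -->
      <xsl:copy-of select="object_category"/>
      <!-- ... then all children unchanged, including object_category again -->
      <xsl:copy-of select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Whether the duplicate is harmless depends on how the ingest handles repeated elements, so the version above that filters it out is the safer default.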
I need help processing the following XML document using XSLT 1.0, and I don't have the words to describe what the actual problem is.
I have a collection of collectionA items that I'd like to process. Each collectionA item is related to collectionB items, and I'd like to get the sum of all related num values. The relationship is many-to-many and is defined by the collectionC items, which connect the A and B items.
In any other object-model-based programming language I would use a loop. The problem is that I don't see how to both loop AND collect the sum of all values in XSLT 1.0.
<root>
<collectionA>
<id>1</id>
</collectionA>
<collectionA>
<id>2</id>
</collectionA>
<collectionB>
<id>1</id>
<num>11</num>
</collectionB>
<collectionB>
<id>2</id>
<num>22</num>
</collectionB>
<collectionB>
<id>3</id>
<num>33</num>
</collectionB>
<collectionB>
<id>4</id>
<num>44</num>
</collectionB>
<collectionC>
<collectionAid>1</collectionAid>
<collectionBid>1</collectionBid>
<collectionBid>2</collectionBid>
</collectionC>
<collectionC>
<collectionAid>2</collectionAid>
<collectionBid>3</collectionBid>
<collectionBid>4</collectionBid>
</collectionC>
</root>
Expected result is:
<root>
<collectionA>
<id>1</id>
<sum>33</sum>
</collectionA>
<collectionA>
<id>2</id>
<sum>77</sum>
</collectionA>
</root>
Anyone!?
"I don't see how to both loop AND to collect the sum of all values"
You do not need to do either one of these. All you need is a pair of keys to resolve the cross-references. Then use them to select the related nodes and sum their values:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="b" match="collectionB" use="id" />
<xsl:key name="c" match="collectionC" use="collectionAid" />
<xsl:template match="/root">
<xsl:copy>
<xsl:for-each select="collectionA">
<xsl:copy>
<xsl:copy-of select="id"/>
<sum>
<xsl:value-of select="sum(key('b', key('c', id)/collectionBid)/num)"/>
</sum>
</xsl:copy>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
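To make the nested key() call easier to read, here is the same lookup spelled out with plain predicates instead of keys. Treat it as an illustrative sketch rather than a replacement: the variable name b-ids is made up, and on large inputs the keyed version should scale better because the predicates rescan the siblings for every collectionA.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/root">
    <xsl:copy>
      <xsl:for-each select="collectionA">
        <!-- step 1: the collectionBid values linked to this A item
             (what key('c', id)/collectionBid selects) -->
        <xsl:variable name="b-ids"
            select="../collectionC[collectionAid = current()/id]/collectionBid"/>
        <xsl:copy>
          <xsl:copy-of select="id"/>
          <sum>
            <!-- step 2: every collectionB whose id equals any of those values
                 (what the outer key('b', ...) selects), then sum their num -->
            <xsl:value-of select="sum(../collectionB[id = $b-ids]/num)"/>
          </sum>
        </xsl:copy>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The comparison id = $b-ids is true when id matches any node in the set, which is why no explicit loop is needed. For the sample input this selects B items 1 and 2 for the first collectionA (11 + 22) and B items 3 and 4 for the second (33 + 44).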
This is not the browser type XSLT, this is for processing data (SAP B1 Integration Framework). Suppose we have two SQL Tables, HEADER and LINE and we want to avoid the kind of work where we first SELECT from the HEADER and then launch a separate select for the lines for each, because that requires "visual programming", and we like writing code more than connecting arrows. So we are sending the server a query like SELECT * FROM HEADER, SELECT * FROM LINES and we get an XML roughly like this:
<ResultSets>
<ResultSet>
<Row><MHeaderNum>1</MHeaderNum></Row>
<Row><MHeaderNum>2</MHeaderNum></Row>
<Row><MHeaderNum>3</MHeaderNum></Row>
</ResultSet>
<ResultSet>
<Row><LineNum>1</LineNum> <HeaderNum>1</HeaderNum></Row>
<Row><LineNum>2</LineNum> <HeaderNum>2</HeaderNum></Row>
<Row><LineNum>1</LineNum> <HeaderNum>3</HeaderNum></Row>
<Row><LineNum>2</LineNum> <HeaderNum>1</HeaderNum></Row>
<Row><LineNum>1</LineNum> <HeaderNum>2</HeaderNum></Row>
<Row><LineNum>2</LineNum> <HeaderNum>3</HeaderNum></Row>
</ResultSet>
</ResultSets>
So we think we are imperative, procedural programmers and pull a
<xsl:for-each select="//ResultSets/ResultSet[1]/Row">
<!-- do stuff with header data -->
<xsl:for-each select="//ResultSets/ResultSet[2]/Row[HeaderNum=MHeaderNum]">
<!-- do stuff with the line data belonging to this particular header -->
</xsl:for-each>
</xsl:for-each>
And of course this does not blinkin' work because MHeaderNum went out of context like grunge went out of fashion, and we cannot save it into a variable either because we will not be able to update that variable, as XSLT is sort of an immutable functional programming language.
But fear not, says an inner voice, because XSLT gurus can solve things like that with templates. Templates, if I understand it, are sort of XSLT's take on functions. They can be recursive and stuff like that. So can they be used to solve problems like this?
And of course we are talking about XSLT 1.0 because I don't know whether Java ever bothered to implement the later versions, but SAP certainly did not bother to use said hypothetical implementation.
Or should I really forget about this and just connect my visual arrows? The thing is, SQL is not supposed to be used in such an iterate-through-headers-then-iterate-through-lines way. What I am trying to do is what makes an SQL database happy: get a big ol' chunk of data out of it and then process it somewhere else, not bother it with seventy zillion tiny queries. And in our case the somewhere else is sadly XSLT, although technically I could try JavaScript, as SAP added Nashorn to this pile of mess as well. But maybe it is solvable in "pure" XSL?
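In XSLT 1.0 as well as later versions, and with templates as well as with for-each, the current() function exists: //ResultSets/ResultSet[2]/Row[HeaderNum=current()/MHeaderNum]. Inside the predicate, a bare MHeaderNum is looked up on the line Row being tested, whereas current() refers to the header Row that the outer for-each is processing.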
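Put into a complete stylesheet, the current() approach might look like the sketch below. The output element names (output, header, line) are only illustrative, since the question does not say what the result should look like.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/ResultSets">
    <output>
      <xsl:for-each select="ResultSet[1]/Row">
        <header num="{MHeaderNum}">
          <!-- current() is the header Row of the outer for-each;
               HeaderNum is read from each candidate line Row -->
          <xsl:for-each
              select="/ResultSets/ResultSet[2]/Row[HeaderNum = current()/MHeaderNum]">
            <line>
              <xsl:value-of select="LineNum"/>
            </line>
          </xsl:for-each>
        </header>
      </xsl:for-each>
    </output>
  </xsl:template>
</xsl:stylesheet>
```

Binding MHeaderNum to an xsl:variable inside the outer for-each would work just as well. The variable is never updated; a fresh one is bound for each header row.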
The best way to resolve cross-references is by using a key.
For example, the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="line-by-header" match="ResultSet[2]/Row" use="HeaderNum" />
<xsl:template match="/ResultSets">
<output>
<xsl:for-each select="ResultSet[1]/Row">
<header num="{MHeaderNum}">
<xsl:for-each select="key('line-by-header', MHeaderNum)">
<line>
<xsl:value-of select="LineNum"/>
</line>
</xsl:for-each>
</header>
</xsl:for-each>
</output>
</xsl:template>
</xsl:stylesheet>
when applied to the following input:
XML
<ResultSets>
<ResultSet>
<Row><MHeaderNum>1</MHeaderNum></Row>
<Row><MHeaderNum>2</MHeaderNum></Row>
<Row><MHeaderNum>3</MHeaderNum></Row>
</ResultSet>
<ResultSet>
<Row><LineNum>1</LineNum> <HeaderNum>1</HeaderNum></Row>
<Row><LineNum>2</LineNum> <HeaderNum>2</HeaderNum></Row>
<Row><LineNum>3</LineNum> <HeaderNum>3</HeaderNum></Row>
<Row><LineNum>4</LineNum> <HeaderNum>1</HeaderNum></Row>
<Row><LineNum>5</LineNum> <HeaderNum>2</HeaderNum></Row>
<Row><LineNum>6</LineNum> <HeaderNum>3</HeaderNum></Row>
</ResultSet>
</ResultSets>
will return:
Result
<?xml version="1.0" encoding="UTF-8"?>
<output>
<header num="1">
<line>1</line>
<line>4</line>
</header>
<header num="2">
<line>2</line>
<line>5</line>
</header>
<header num="3">
<line>3</line>
<line>6</line>
</header>
</output>
You can try the following XSLT. It uses three templates.
Because the desired output is unknown, I placed some arbitrary processing in each of the header and line item templates.
XSLT
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<root>
<xsl:apply-templates/>
</root>
</xsl:template>
<xsl:template match="/ResultSets/ResultSet[1]">
<ResultSet1>
<xsl:for-each select="Row">
<r>
<xsl:value-of select="MHeaderNum"/>
</r>
</xsl:for-each>
</ResultSet1>
</xsl:template>
<xsl:template match="/ResultSets/ResultSet[2]">
<ResultSet2>
<xsl:for-each select="Row">
<xsl:copy-of select="."/>
</xsl:for-each>
</ResultSet2>
</xsl:template>
</xsl:stylesheet>
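As for the question of whether templates can do the header/line pairing: they can, without recursion. The following is a sketch only, with assumed mode names (header, line) and illustrative output element names; it selects each header's lines with current() and hands them to a second template.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/ResultSets">
    <output>
      <xsl:apply-templates select="ResultSet[1]/Row" mode="header"/>
    </output>
  </xsl:template>

  <!-- one header row: emit it, then process only its own lines -->
  <xsl:template match="Row" mode="header">
    <header num="{MHeaderNum}">
      <xsl:apply-templates
          select="../../ResultSet[2]/Row[HeaderNum = current()/MHeaderNum]"
          mode="line"/>
    </header>
  </xsl:template>

  <!-- one line row -->
  <xsl:template match="Row" mode="line">
    <line>
      <xsl:value-of select="LineNum"/>
    </line>
  </xsl:template>
</xsl:stylesheet>
```

The modes are there because both result sets use the element name Row; they keep the header and line templates from competing for the same nodes.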
I am completely new to XSLT, and my goal is to retrieve a value from a node that is in a namespace. My XML looks as below.
<WBIFNMsg>
<AppData>
<AppName>INFM</AppName>
<MsgType>Notification</MsgType>
<MsgStatus>Success</MsgStatus>
</AppData>
<AppMsg>
<Document xmlns="urn:swift:xsd:setr.010.001.03">
<SbcptOrdrV03>
<MsgId>
<Id>29331095XXXXML</Id>
<CreDtTm>2019-06-07T10:30:43.681+02:00</CreDtTm>
</MsgId>
</SbcptOrdrV03>
</Document>
</AppMsg>
</WBIFNMsg>
I am trying to get the value of the Id element, which is under Document/SbcptOrdrV03/MsgId/, via XSLT. As the Document element has a namespace, I have included it in my XSLT, but I still couldn't retrieve the value.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:b="urn:swift:xsd:setr.010.001.03" exclude-result-prefixes="b">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:template match="/">
<data>
<messageDataApps>
<field>
<fieldName>SENDER.STP</fieldName>
<value>
<xsl:value-of select="b:WBIFNMsg/b:AppMsg/b:Document/b:SbcptOrdrV03/b:MsgId/b:id"/>
</value>
</field>
</messageDataApps>
</data>
</xsl:template>
</xsl:stylesheet>
Appreciate if anyone could help on this. Thanks in advance.
Only Document and its descendants are in a namespace and require using a prefix when addressing them. Instead of:
<xsl:value-of select="b:WBIFNMsg/b:AppMsg/b:Document/b:SbcptOrdrV03/b:MsgId/b:id"/>
try:
<xsl:value-of select="WBIFNMsg/AppMsg/b:Document/b:SbcptOrdrV03/b:MsgId/b:Id"/>
Note also that XML is case-sensitive: b:id is not the same thing as b:Id.
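For reference, this is roughly what the stylesheet from the question looks like with that one line changed. Everything else is taken from the question as posted; the b prefix is arbitrary, only the namespace URI has to match the one on Document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="urn:swift:xsd:setr.010.001.03"
    exclude-result-prefixes="b">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <data>
      <messageDataApps>
        <field>
          <fieldName>SENDER.STP</fieldName>
          <value>
            <!-- WBIFNMsg and AppMsg are in no namespace: no prefix.
                 Document and its descendants inherit the default
                 namespace declared on Document: prefix required. -->
            <xsl:value-of
                select="WBIFNMsg/AppMsg/b:Document/b:SbcptOrdrV03/b:MsgId/b:Id"/>
          </value>
        </field>
      </messageDataApps>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

With the sample message from the question, the value element should then contain 29331095XXXXML.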
For an integration with a third-party web service, I am receiving my "actual" data in a CDATA section.
<getDocumentsReqResponse xmlns="http://tempuri.org/">
<getDocumentsReqResult>
<![CDATA[<?xml version="1.0"?>
<wsResult>
<rsCode>00</rsCode>
<rsMessage>...</rsMessage>
</wsResult>]]></getDocumentsReqResult>
</getDocumentsReqResponse>
So I was trying to use the Inbound path option on the send port, but when I try this I get an empty message. Does this option work with CDATA?
I just entered the XPath (/*[local-name()='getDocumentsReqResponse' and namespace-uri()='http://tempuri.org/']/*[local-name()='getDocumentsReqResult' and namespace-uri()='http://tempuri.org/']) and set the Node encoding to String.
Or am I doing something wrong? I have used it in the past when receiving an HTML-encoded string, but never with CDATA.
I will need an orchestration in the process anyway, so if that is the only option I will have to go for that.
Thanks for the help
Try:
/*[local-name()='getDocumentsReqResponse']/*[local-name()='getDocumentsReqResult']/text()
Technically, in XPath, text() should return the CDATA content, because a CDATA section is just another way of writing a text node. You can test it in any .NET app, since it will behave the same.
I fixed it with a simple XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:var="http://schemas.microsoft.com/BizTalk/2003/var" exclude-result-prefixes="msxsl var s0" version="1.0" xmlns:s0="http://tempuri.org/">
<xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />
<xsl:template match="/">
<xsl:apply-templates select="/s0:getDocumentsReqResponse" />
</xsl:template>
<xsl:template match="/s0:getDocumentsReqResponse">
<xsl:value-of select="normalize-space(s0:getDocumentsReqResult)" disable-output-escaping="yes" />
</xsl:template>
</xsl:stylesheet>
That also does the trick. :-)
This is more a request for clarification.
According to an answer to another question, XSLT variables are cheap! My question is: does this statement hold in all scenarios? Short-lived variables that are created and destroyed within four lines of code aren't bothersome, but loading a root node or child entities into a variable is, in my opinion, bad practice.
I have two XSLT files, designed for the same input and output requirements:
XSLT1 (without unnecessary variable):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<Collection>
<xsl:for-each select="CATALOG/CD">
<DVD>
<Cover>
<xsl:value-of select="string(TITLE)"/>
</Cover>
<Author>
<xsl:value-of select="string(ARTIST)"/>
</Author>
<BelongsTo>
<xsl:value-of select="concat(concat(string(COUNTRY), ' '), string(COMPANY))"/>
</BelongsTo>
<SponsoredBy>
<xsl:value-of select="string(COMPANY)"/>
</SponsoredBy>
<Price>
<xsl:value-of select="string(number(string(PRICE)))"/>
</Price>
<Year>
<xsl:value-of select="string(floor(number(string(YEAR))))"/>
</Year>
</DVD>
</xsl:for-each>
</Collection>
</xsl:template>
</xsl:stylesheet>
XSLT2 (with unnecessary variable "root" in which whole XML is loaded):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="root" select="."/>
<Collection>
<xsl:for-each select="$root/CATALOG/CD">
<DVD>
<Cover>
<xsl:value-of select="string(TITLE)"/>
</Cover>
<Author>
<xsl:value-of select="string(ARTIST)"/>
</Author>
<BelongsTo>
<xsl:value-of select="concat(concat(string(COUNTRY), ' '), string(COMPANY))"/>
</BelongsTo>
<SponsoredBy>
<xsl:value-of select="string(COMPANY)"/>
</SponsoredBy>
<Price>
<xsl:value-of select="string(number(string(PRICE)))"/>
</Price>
<Year>
<xsl:value-of select="string(floor(number(string(YEAR))))"/>
</Year>
</DVD>
</xsl:for-each>
</Collection>
</xsl:template>
</xsl:stylesheet>
Approach 2 is what exists in our live system, and in fact the XML ranges from several KB to a few MB. In those XSLT files the use of variables extends to child entities as well.
To put forward my proposal to change the approach, I need to verify the theory behind it.
As I understand it, in approach 2 the system reloads the XML data into memory over and over (with multiple variables loading child entities the situation gets worse), and this slows down the transformation.
Before posting this question I timed both stylesheets. The first approach takes a few milliseconds less than approach 2. (I used copies of the XML file to test the two XSL files, to avoid complications with the system cache.) But the system cache might still play a big, confusing role here.
Despite this analysis I still have a question: do we really need to avoid using variables? And as far as my system is concerned, how worthwhile is it to modify the live XSLT files to use approach 1?
Or are XSLT variables different from variables in other programming languages (in case I'm not aware)? For example, perhaps an XSLT variable doesn't actually store the data when you do select="." but merely points to it. In that case I could continue using XSLT variables without hesitation.
What is your suggestion on this?
Quick Info on current system:
Host Programming Language or System: Siebel (C++ is the backend code)
XSLT Processor: Xalan (unless Saxon is used explicitly)
I agree with the comments made that you need to measure performance with your particular XSLT processor.
But expectations like "approach-2, system is reloading the XML data over and over in memory" seem wrong to me. The XSLT processor builds an input tree of the primary input XML document anyway, and I can't imagine that any implementation would respond to <xsl:variable name="root" select="."/> by loading the document again. That would even be wrong, because node identity and generate-id() would no longer work. The variable simply keeps a reference to the document node of the existing input tree.
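One way to convince yourself of this on your own processor is a small check like the following sketch. It compares the node held in the variable with the document node itself; the element names check, same-id and node-count are made up for the illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <xsl:variable name="root" select="."/>
    <check>
      <!-- true when $root and / are the same node, not a copy -->
      <same-id>
        <xsl:value-of select="generate-id($root) = generate-id(/)"/>
      </same-id>
      <!-- a union removes duplicates, so the same node counts once -->
      <node-count>
        <xsl:value-of select="count($root | /)"/>
      </node-count>
    </check>
  </xsl:template>
</xsl:stylesheet>
```

A conforming XSLT 1.0 processor should report true and 1 here, since a variable bound with select holds a node-set that refers to the original nodes.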
Of course in your sample where you have a single input document and a single template where the current node is the document anyway the use of the variable you have is superfluous. But there are cases where you need to store the document node of the primary input document, in particular when you deal with multiple documents.
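A sketch of such a multi-document case, purely for illustration: countries.xml, its countries/country structure and the code attribute are hypothetical and not part of the question. Inside the for-each over the second document, / would refer to countries.xml, so the stored document node is the way back to the primary input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- document node of the primary input, bound once at the top level -->
  <xsl:variable name="root" select="/"/>

  <xsl:template match="/">
    <Collection>
      <!-- hypothetical lookup file: <countries><country code="...">...</country></countries> -->
      <xsl:for-each select="document('countries.xml')/countries/country">
        <Country name="{.}">
          <!-- the context is now inside countries.xml; $root still
               refers to the catalog document -->
          <xsl:for-each select="$root/CATALOG/CD[COUNTRY = current()/@code]">
            <Cover>
              <xsl:value-of select="TITLE"/>
            </Cover>
          </xsl:for-each>
        </Country>
      </xsl:for-each>
    </Collection>
  </xsl:template>
</xsl:stylesheet>
```

Here the variable is not a performance concern at all; it is the only convenient handle on the first document once the context has moved to the second.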