Faster way to process XML in Mule

We have a very large XML request containing nearly 10,000 elements, as shown below:
<root>
<message></message>
<message></message>
<message></message>
.....
.....
.....
.....
<message></message>
<message></message>
<message></message>
</root>
In Mule we are using an XPath extractor in a for-each processor, which takes a very long time.
Is there a way to process large XML files faster in Mule?
<foreach doc:name="Foreach" batchSize="1" collection="#[xpath://message]">
<!-- stuff -->
</foreach>
Changing batchSize didn't help either.
Is there any other approach that would make this faster?

You can achieve this by following a few best practices in Mule:
Use a SAX parser instead of DOM to load the XML.
Divide the file into chunks and process the chunks in parallel.
If you store data in several variables, do not store the complete XML in them, since that may cause memory problems.
Remove variables at the end of each Mule flow if they are no longer needed.
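The streaming idea behind the SAX advice can be illustrated outside Mule. The sketch below is plain Python, not Mule configuration, and stream_messages is a hypothetical helper name. It assumes the <message> elements are direct children of the root, as in the question, and uses the standard library's incremental parser to visit each element and release it right away, so memory use does not grow with the number of elements.

```python
import io
import xml.etree.ElementTree as ET


def stream_messages(source, tag="message"):
    """Yield the text of each matching element without keeping the whole tree."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem.text or ""
            root.clear()  # drop already-processed children of the root


# Build a document shaped like the one in the question (10,000 <message> elements).
body = b"".join(b"<message>m%d</message>" % i for i in range(10000))
sample = io.BytesIO(b"<root>" + body + b"</root>")
count = sum(1 for _ in stream_messages(sample))
```

Each yielded item could be handed to a worker pool to get the "process chunks in parallel" part of the advice; that step is left out here to keep the sketch short.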

Related

How to split xml blob content in database into multiple files based on tag in mule

How to split xml file into multiple files based on a tag in xml file using mule
The XML contains <EOF> elements, and we need to split the XML into chunks based on <EOF>.
You can do something similar to:
<splitter expression="#[xpath('//EOF')]" />
That would generate one message for each EOF tag in your XML. Depending on the structure, you may need to make the XPath expression more precise.
Use the Splitter component in Mule:
http://www.mulesoft.org/documentation/display/current/Splitter+Flow+Control+Reference

how to replace xml element value in mule

<healthcare>
<plans>
<plan1>
<planid>100</planid>
<planname>medical</planname>
<desc>medical</desc>
<offerprice>500</offerprice>
<area>texas</area>
</plan1>
<plan2>
<planid>101</planid>
<planname>dental</planname>
<desc>dental</desc>
<offerprice>1000</offerprice>
<area>texas</area>
</plan2>
</plans>
</healthcare>
<splitter evaluator="xpath" expression="/healthcare/plans" doc:name="Splitter"/>
<transformer ref="domToXml" doc:name="Transformer Reference"/>
<logger level="INFO" doc:name="Logger" message=" plans details...#[message.payload]" />
I have the input XML data shown above and want to replace the offerprice value in it. I have tried various approaches without success.
My requirement is to call an external API and, based on the result, change the offerprice value in the input XML.
Any help would be much appreciated.
You can use XSLT to transform the XML file into another XML file (with or without the same schema). Here's a small example of how it would look in Mule:
http://marcotello.com/mule-esb/using-the-xslt-transformer-in-mule-esb/
There are also many resources online for learning how to write XSLT files.
There are many ways to do it.
You can use the Mule XSLT transformer in your flow to change the value of offerprice in the input XML; the output is the same XML with the value you want.
Another way is to use Groovy with XmlSlurper to parse the input XML, replace the value, and rebuild the XML you want.
References: "XML Mapping in Mule" and "Mule: Enriching an XML with additional information from DB".
See also:
http://www.ibm.com/developerworks/library/j-pg05199/index.html
Hope this helps.
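The parse, replace, rebuild approach can be sketched as follows. This is plain Python with the standard library, used only to illustrate the idea; it is not Mule, XSLT, or Groovy code. The function name replace_offer_price and the trimmed sample document are assumptions made for the example, and the new price stands in for whatever the external API returns.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the <healthcare> document from the question.
SAMPLE = """<healthcare><plans>
<plan1><planid>100</planid><offerprice>500</offerprice></plan1>
<plan2><planid>101</planid><offerprice>1000</offerprice></plan2>
</plans></healthcare>"""


def replace_offer_price(xml_text, plan_id, new_price):
    """Return the XML with offerprice replaced for the plan whose planid matches."""
    root = ET.fromstring(xml_text)
    for plan in root.find("plans"):
        if plan.findtext("planid") == str(plan_id):
            plan.find("offerprice").text = str(new_price)
    return ET.tostring(root, encoding="unicode")


# 750 is a placeholder for the value obtained from the external API.
updated = replace_offer_price(SAMPLE, 101, 750)
```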

How to parse simple xml file

>> ? xml
No information on xml
There's parse-xml but it seems to me that it was for Rebol2.
I've searched for xml scripts in rebol.org and found xml-object.r that seemed to me like the most up to date from all searches.
I know about altxml, too, but the examples given are for html.
So, I'd like to ask what my choices are if I want to parse and use information from files of over 1 GB with this simplified structure:
<?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
<SalesFile xmlns="urn:StandardSalesFile-1.0">
<Header>
<SalesFileVersion>1.01</SalesFileVersion>
<DateCreation>2014-04-30</DateCreation>
</Header>
<SalesInvoices>
<Invoice>
<InvoiceNo>INV 1/1</InvoiceNo>
<DocumentStatus>
<InvoiceStatus>N</InvoiceStatus>
<InvoiceStatusDate>2014-01-03T17:57:59</InvoiceStatusDate>
</DocumentStatus>
</Invoice>
<Invoice>
<InvoiceNo>INV 2/1</InvoiceNo>
<DocumentStatus>
<InvoiceStatus>N</InvoiceStatus>
<InvoiceStatusDate>2014-01-03T17:59:12</InvoiceStatusDate>
</DocumentStatus>
</Invoice>
</SalesInvoices>
</SalesFile>
Is Rebol3 going to have a parse-xml tool? Should I use xml-object? If so, how? It's still beyond my novice level of the language. Are there other options?
There is also a Rebol 3 library by Christopher Ross-Gill called alt-xml.
http://www.ross-gill.com/page/XML_and_REBOL
This can translate the XML to either a block! or object! representation.
Your question states that these XML files are large and may not fit in main memory. I would suggest that creating 1GB XML files is not best practice as many parsers, including this one, do attempt to load the files into memory.
To support this you will have to chunk the files yourself by using open on the file and copy/part chunks out of the file. This is a bit messy, but it will work.
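The manual chunking described above can be sketched like this. The example is Python rather than Rebol and is only meant to show the buffering logic: read a fixed-size chunk, append it to a buffer, and cut out every complete <Invoice> element found so far. The name iter_invoices and the chunk size are assumptions, and the sketch relies on the input being well-formed with no nested <Invoice> elements.

```python
import io


def iter_invoices(stream, chunk_size=64 * 1024,
                  open_tag="<Invoice>", close_tag="</Invoice>"):
    """Yield each complete <Invoice>...</Invoice> block from a text stream."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of input; any unfinished tail is discarded
        buffer += chunk
        while True:
            start = buffer.find(open_tag)
            if start == -1:
                # Keep only enough text to complete a partially read open tag.
                buffer = buffer[-(len(open_tag) - 1):]
                break
            end = buffer.find(close_tag, start)
            if end == -1:
                buffer = buffer[start:]  # wait for the rest of this invoice
                break
            end += len(close_tag)
            yield buffer[start:end]
            buffer = buffer[end:]


text = ("<SalesInvoices><Invoice><InvoiceNo>INV 1/1</InvoiceNo></Invoice>\n"
        "<Invoice><InvoiceNo>INV 2/1</InvoiceNo></Invoice></SalesInvoices>")
invoices = list(iter_invoices(io.StringIO(text), chunk_size=7))
```

For a real file, the stream would come from something like open(path, encoding="cp1252") to match the Windows-1252 declaration in the sample.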
One way to make this cleaner is to use parse as per HostileFork's suggestion and modify the series as you parse it. Parse is very flexible in this regard.
Ideally parse would be able to work directly on port! objects, but this is only a future wish list item at the moment.
Do you really need to deal with the XML file as structure? If not, have you considered just using PARSE?
(Warning: the following is untested, I'm just presenting the concept.)
Invoices: copy []
parse my-doc [
    <?xml version="1.0" encoding="Windows-1252" standalone="yes"?>
    thru <SalesFile xmlns="urn:StandardSalesFile-1.0">
    thru <Header>
    thru <SalesFileVersion> copy SalesFileVersion to </SalesFileVersion>
    </SalesFileVersion>
    thru <DateCreation> copy DateCreation to </DateCreation>
    </DateCreation>
    thru </Header>
    thru <SalesInvoices>
    any [
        thru <Invoice>
        ; declare the fields up front so the path assignments below have something to set
        (Invoice: object [No: Status: StatusDate: none])
        thru <InvoiceNo> copy InvoiceNo to </InvoiceNo>
        </InvoiceNo>
        (Invoice/No: InvoiceNo)
        thru <DocumentStatus>
        thru <InvoiceStatus> copy InvoiceStatus to </InvoiceStatus>
        </InvoiceStatus>
        (Invoice/Status: InvoiceStatus)
        thru <InvoiceStatusDate> copy InvoiceStatusDate to </InvoiceStatusDate>
        </InvoiceStatusDate>
        (Invoice/StatusDate: InvoiceStatusDate)
        thru </DocumentStatus>
        thru </Invoice>
        ; collect the finished invoice
        (append Invoices Invoice)
    ]
    thru </SalesInvoices>
    thru </SalesFile>
    to end
]
If you know you have well-formed XML and don't want a dependency on a library for processing clunky-ol' XML, Rebol can get pretty far and clear with PARSE. As TAG! is just a subclass of string, you can make things look relatively literate. And it's much more lightweight to just work with the strings.
Though if structural manipulations are required, you'll need something that makes a DOM. Altxml is the go-to right now, AFAIK.
(Hmm...I had a name for the pattern copy x to <foo> <foo> that escapes me at the moment, but this is a good case for it.)
Another option is %Rebol-Dom.r or %rebol-dom-mdlparser.r.
If you're willing to use Rebol2 with parse to seek through to the node name and then copy a chunk of data, you could feed that to Rebol-Dom.r getnodename "salesInvoice" and append each node element to a block, repeating for every node.

XPATH in mule flow returns multiple values

I have a requirement to run an XPath evaluator against an XML payload in a Mule flow. The XPath evaluator can return a single value or multiple values. I need to store these values in a flow variable and use them later in the flow. Can someone help me implement this?
Appreciate your help on this.
Thanks
For extracting values from an XML document, use the XPath extractor.
<mulexml:xpath-extractor-transformer expression="/a:my/b:xpath/text()"/>
You can also use Mule Expression Language to create dynamic XPath expressions:
<expression-transformer mimeType="text/xml" evaluator="xpath" expression="//school/day[@date= #[function:datestamp:yyyy-MM-dd] ]/name"/>
However this can get somewhat messy for complicated expressions, so I have created my own dynamic XPath transformer:
<dx:dxpath expression="/b:team[name = $teamName]/b:player[b:name = $playerName]/b:goals/text()">
<dx:variable key="playerName" value="#[header:invocation:playerName]"/>
<dx:variable key="teamName" value="#[header:invocation:teamName]"/>
<!-- unlimited number of variables -->
</dx:dxpath>
which is somewhat easier on the eyes.
Then wrap your flow with an enricher:
<enricher target="#[variable:myData]">
<processor-chain>
<!-- your flow here -->
</processor-chain>
</enricher>
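What "one or many values stored in a variable" amounts to can be shown in a few lines. The sketch below is plain Python with the standard library, not Mule configuration; extract_values and the flow_vars dictionary are hypothetical stand-ins for the XPath evaluation and the flow variable. Always returning a list means later steps handle the single-value and multi-value cases the same way.

```python
import xml.etree.ElementTree as ET


def extract_values(xml_text, path):
    """Return the text of every node matching the path, as a list."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall(path)]


doc = "<school><day><name>Mon</name></day><day><name>Tue</name></day></school>"

flow_vars = {}  # stand-in for flow variables
flow_vars["myData"] = extract_values(doc, "./day/name")      # several matches
flow_vars["firstDay"] = extract_values(doc, "./day[1]/name")  # a single match
```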

Import Xml nodes as Xml column with SSIS

I'm trying to use the XML Source to shred an XML source file; however, I do not want the entire document shredded into tables. Rather, I want to import the XML nodes into rows of XML.
A simplified example would be to import the document below into a table called "people" with a column called "person" of type "xml". Looking at the XML Source, it seems suited to shredding the source XML into multiple records, which is not quite what I'm looking for.
Any suggestions?
<people>
<person>
<name>
<first>Fred</first>
<last>Flintstone</last>
</name>
<address>
<line1>123 Bedrock Way</line1>
<city>Drumheller</city>
</address>
</person>
<person>
<!-- more of the same -->
</person>
</people>
I didn't think that SSIS 2005 supported the XML datatype at all. I suppose it "supports" it as DT_NTEXT.
In any case, you can't use the XML Source for this purpose. You would have to write your own source component. That's not actually as hard as it sounds. Base it on the examples in Books Online. The processing would consist of moving to the first child node, then calling XmlReader.ReadSubtree to return a new XmlReader over just the next <person/> element. Then use your favorite XML API to read the entire <person/>, convert the resulting XML to a string, and pass it along down the pipeline. Repeat for all <person/> nodes.
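The loop described above (advance to the next <person/>, serialize just that element, pass the string along) can be sketched as follows. This is a Python analogue using the standard library, not SSIS or .NET code, and person_rows is a hypothetical name; in a custom source component each yielded string would become one row for the "person" column.

```python
import io
import xml.etree.ElementTree as ET


def person_rows(source):
    """Yield each <person> element as its own XML string, one per row."""
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "person":
            # Serialize only this element; strip() drops trailing whitespace.
            yield ET.tostring(elem, encoding="unicode").strip()
            elem.clear()  # free the element's children once it is emitted


doc = (b"<people>"
       b"<person><name><first>Fred</first></name></person>"
       b"<person><name><first>Wilma</first></name></person>"
       b"</people>")
rows = list(person_rows(io.BytesIO(doc)))
```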
Could you perhaps change your xml output so that the content of person is seen as a string? Use escape chars for the <>.
You could use a script task to parse it as well, I'd imagine.