Transforming a single xml to many documents - mule

I have to create a mapping between two xsd schemas, where the input document contains a list (sequence) of elements, each of which maps to a single output document. Moreover, each output document should include top level input data that is not a part of the list. To illustrate the problem, the input document contains data about a customer (contact info, etc) and list of invoices for them, and the output should be multiple documents, each containing one invoice and the customer data.
Can I do this using DataMapper or some other approach? If I create a mapping between the input list elements and the output document, DataMapper outputs an aggregation of all the created output documents. It also seems that I cannot refer to the top-level input elements from inside the "list element to output document" mapping.

Supposing the root element in your source XSD contains a list of "Item" elements, you could first split the document into Items:
<splitter expression="#[xpath('//Item')]" doc:name="Splitter" enableCorrelation="IF_NOT_SET"/>
And then after the splitter use a DataMapper to map the Item elements to a target element in your other XSD. DataMapper requires that "Item" also be a root element in your source XSD in order to do the mapping from XSD to XSD. If it's not possible/desirable to make "Item" a root element in the source XSD, then you could create a sample XML and use DataMapper to generate an XSD from that. Otherwise you could roll your own transformer or use the XSLT transformer.
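If you go the "roll your own transformer" route, the logic is roughly the following. This is only a sketch in plain Python, not Mule or DataMapper code, and the element names (Customer, Invoices, Invoice, InvoiceDocument) are made up for illustration. It shows the part the question asks about: each output document gets one list element plus a copy of the top-level customer data.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical input: customer data at the top level plus a list of invoices.
SAMPLE = """<Customer>
  <Name>Acme</Name>
  <Email>billing@acme.example</Email>
  <Invoices>
    <Invoice><Number>1</Number></Invoice>
    <Invoice><Number>2</Number></Invoice>
  </Invoices>
</Customer>"""


def split_invoices(xml_text):
    """Return one XML string per invoice, each carrying the customer data."""
    root = ET.fromstring(xml_text)
    # Every top-level child except the list container is shared customer data.
    shared = [child for child in root if child.tag != "Invoices"]
    documents = []
    for invoice in root.iter("Invoice"):
        out = ET.Element("InvoiceDocument")
        customer = ET.SubElement(out, "Customer")
        # Deep copies keep the output documents independent of each other.
        customer.extend([copy.deepcopy(child) for child in shared])
        out.append(copy.deepcopy(invoice))
        documents.append(ET.tostring(out, encoding="unicode"))
    return documents
```

Calling split_invoices(SAMPLE) gives two strings, one per Invoice, and each contains the Name and Email elements.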


How to split xml blob content in database into multiple files based on tag in mule

How to split xml file into multiple files based on a tag in xml file using mule
The XML contains <EOF> elements, and we need to chunk the XML on each <EOF>.
You can do something similar to:
<splitter expression="#[xpath('//EOF')]" />
That would generate many messages, one for each EOF tag in your XML. Depending on the structure, you may need to make the XPath expression more precise.
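To see why the precision of the expression matters, here is a small sketch in plain Python (not Mule). The sample document and the batch and group element names are invented. A descendant search like //EOF also picks up nested EOF elements, while a path limited to direct children does not.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with one EOF nested deeper than the others.
SAMPLE = "<batch><EOF>a</EOF><group><EOF>nested</EOF></group><EOF>b</EOF></batch>"


def chunk_texts(xml_text, path):
    """Return the text of every element matched by the given path."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(path)]


# ".//EOF" matches EOF at any depth, like //EOF in the splitter expression.
all_chunks = chunk_texts(SAMPLE, ".//EOF")
# "EOF" matches only direct children of the root element.
top_level_chunks = chunk_texts(SAMPLE, "EOF")
```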
Use the Splitter component in Mule:
http://www.mulesoft.org/documentation/display/current/Splitter+Flow+Control+Reference

Mapping records to specific path in EMT when node names are duplicated

We have an Externally Managed Taxonomy (EMT) and have been using a node's name to map records to the hierarchy. We are now hitting a problem because some of the node names in the hierarchy are duplicated. Ids are used to make the nodes in the EMT unique, but I haven't found documentation on how to use something other than the name to map a record. For example, how do I map a record to child_2 below, rather than child_1, if both are named "A child"?
Root [id=root]
|-One parent #id=parent_1 #parent=root
| '- A child #id=child_1 #parent=parent_1
'-Other parent #id=parent_2 #parent=root
  '- A child #id=child_2 #parent=parent_2
If you read through the DTD File (for instance C:\Endeca\PlatformServices\11.1.0\conf\dtd\external_dimensions.dtd ), you can try the following.
<node name="One" id="1" classify="false">
<synonym name="1"/>
</node>
... where you could specify alternative values as the synonyms. "One" would be displayed. If your source data has "One", it would not map (because classify=false). Your source data would have to have "1" in order to be mapped.
I'm not 100% sure since I don't have an EMT to play with, FYI.

XML Import with "alternate" form or xml formatting

I have successfully imported an XML file, parsing elements into table attributes, using this XML formatting:
<PN>
<guid>aaaa</guid>
<dataInput>0</dataInput>
<deleted>false</deleted>
<customField1></customField1>
<customField2></customField2>
<customField3></customField3>
<description></description>
<name>name1</name>
<ccid>CC007814</ccid>
<productIds>bbbb</productIds>
</PN>
but it errors when I input an XML file in this format:
<PN guid="aaaa"
deleted="false"
customField1=""
customField2=""
customField3=""
description=""
modified="2010-10-20T00:00:00.001"
created="2010-05-20T18:07:10.416"
name="name1"
ccid="CC006035"
productIds="bbbb"/>
Is this latter form usable? Any help would be appreciated. Thanks.
It's usable, but you're looking at the difference between using tags (your first example) and attributes (your second example). The import has to read attribute values instead of child element text, so your processing is slightly different.
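As a rough illustration of that difference, here is a sketch in plain Python. It is not the import tool from the question, and pn_fields is a made-up helper name. It reads a PN record from either form into the same dictionary, taking attributes from the element itself and text from its child elements.

```python
import xml.etree.ElementTree as ET


def pn_fields(xml_text):
    """Read a PN record written with child elements, attributes, or both."""
    pn = ET.fromstring(xml_text)
    # Attribute form: the values live on the PN element itself.
    fields = dict(pn.attrib)
    # Element form: the values are the text of the child elements.
    for child in pn:
        fields[child.tag] = child.text or ""
    return fields


element_form = "<PN><guid>aaaa</guid><name>name1</name><description></description></PN>"
attribute_form = '<PN guid="aaaa" name="name1" description=""/>'
```

Both strings above produce the same dictionary, which is usually what you want before writing the values to the table.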

Extracting dates from html meta data in FAST-ESP

During document processing I want to extract all dates from html meta data and then identify the latest date which will be used to populate a date field (dtgeneric1).
<meta name="OriginalPublicationDate" content="2010/04/21 12:06:36" />
<meta name="LastModificationDate" content="2010/04/22 14:10:16" />
+ other non-date meta data
Inspection using spy stages shows that our pipeline already adds meta_* attributes but the meta data names will be different across documents from different sources.
#### ATTRIBUTE meta_originalpublicationdate <class 'docproc.DocumentAttributes.TextChunks'>: 2010/04/21 12:06:36
#### ATTRIBUTE meta_lastmodificationdate <class 'docproc.DocumentAttributes.TextChunks'>: 2010/04/22 14:10:16
+ other non-date meta attributes
Ideally we would like to pass all the meta_* attributes to a Python stage and use that to work out which are dates and which is the latest, but there seems to be no way of specifying "all meta attributes" as input.
Has anyone done something similar who can offer advice on the best way to do this?
Thanks
Neil
I suppose a custom stage would do the job: it takes all the needed date attributes as input, compares them to find the newest date, and outputs that value to the target field.
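The comparison itself could look something like this. It is a sketch only: the attributes are passed in as a plain dict, so nothing here uses the real FAST ESP docproc API, and how you collect the meta_* attributes inside the stage is left open. It also assumes every date uses the YYYY/MM/DD HH:MM:SS layout shown in the question. Values that don't parse as dates are simply skipped.

```python
from datetime import datetime

# Assumed layout, taken from the sample meta tags in the question.
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"


def latest_meta_date(attributes):
    """Return the newest date found among meta_* attributes, or None."""
    dates = []
    for name, value in attributes.items():
        if not name.startswith("meta_"):
            continue
        try:
            dates.append(datetime.strptime(str(value).strip(), DATE_FORMAT))
        except ValueError:
            # Not a date in the assumed layout; ignore this attribute.
            continue
    return max(dates) if dates else None


sample_attributes = {
    "meta_originalpublicationdate": "2010/04/21 12:06:36",
    "meta_lastmodificationdate": "2010/04/22 14:10:16",
    "meta_author": "someone",
    "title": "2011/01/01 00:00:00",
}
newest = latest_meta_date(sample_attributes)
```

The result would then be formatted as needed and written to dtgeneric1.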

Import Xml nodes as Xml column with SSIS

I'm trying to use the XML Source to shred an XML source file; however, I do not want the entire document shredded into tables. Rather, I want to import the XML nodes into rows of XML.
A simplified example would be to import the document below into a table called "people" with a column called "person" of type "xml". Looking at the XML Source, it seems suited to shredding the source XML into multiple records, which is not quite what I'm looking for.
Any suggestions?
<people>
<person>
<name>
<first>Fred</first>
<last>Flintstone</last>
</name>
<address>
<line1>123 Bedrock Way</line1>
<city>Drumheller</city>
</address>
</person>
<person>
<!-- more of the same -->
</person>
</people>
I didn't think that SSIS 2005 supported the XML datatype at all. I suppose it "supports" it as DT_NTEXT.
In any case, you can't use the XML Source for this purpose. You would have to write your own. That's not actually as hard as it sounds. Base it on the examples in Books Online. The processing would consist of moving to the first child node, then calling XmlReader.ReadSubtree to return a new XmlReader over just the next <person/> element. Then use your favorite XML API to read the entire <person/>, convert the resulting XML to a string, and pass it along down the pipeline. Repeat for all <person/> nodes.
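Just to show the shape of that loop, here is the same idea in plain Python. A real SSIS source component would be written in .NET with XmlReader, so treat this only as an illustration, and person_rows is a made-up name. Each <person/> element is serialized back to a string, one string per output row.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<people>
  <person>
    <name><first>Fred</first><last>Flintstone</last></name>
    <address><line1>123 Bedrock Way</line1><city>Drumheller</city></address>
  </person>
  <person>
    <name><first>Wilma</first><last>Flintstone</last></name>
  </person>
</people>"""


def person_rows(xml_text):
    """Return one XML string per <person/> element, in document order."""
    root = ET.fromstring(xml_text)
    rows = []
    for person in root.findall("person"):
        # Drop whitespace that follows the element so each row is just the node.
        person.tail = None
        rows.append(ET.tostring(person, encoding="unicode"))
    return rows
```

Each returned string can then be handed down the pipeline as a text column and cast to xml at the destination.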
Could you perhaps change your xml output so that the content of person is seen as a string? Use escape chars for the <>.
You could use a script task to parse it as well, I'd imagine.