'File' access is not allowed due to restriction set by the accessExternalSchema property - xml-validation

I'm trying to validate an XML payload against an XSD, where this XSD refers to another XSD, which in turn refers to yet another - a chain of nested references.
When I include all of the .xsd files in the Validate schema path, I still get:
Root Exception stack trace: org.xml.sax.SAXParseException;
schema_reference: Failed to read schema document 'MPProduct.xsd',
because 'file' access is not allowed due to restriction set by the
accessExternalSchema property.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:306)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaErr(XSDHandler.java:4160)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.reportSchemaFatalError(XSDHandler.java:4135)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.getSchemaDocument(XSDHandler.java:2172)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.resolveSchema(XSDHandler.java:2100)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.constructTrees(XSDHandler.java:1104)
at com.sun.org.apache.xerces.internal.impl.xs.traversers.XSDHandler.parseSchema(XSDHandler.java:623)
at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadSchema(XMLSchemaLoader.java:613)
at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:572)
at com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaLoader.loadGrammar(XMLSchemaLoader.java:538)
at com.sun.org.apache.xerces.internal.jaxp.validation.XMLSchemaFactory.newSchema(XMLSchemaFactory.java:255)
at org.mule.module.xml.internal.operation.SchemaValidatorOperation$2.create(SchemaValidatorOperation.java:142)
at org.mule.module.xml.internal.operation.SchemaValidatorOperation$2.create(SchemaValidatorOperation.java:132)
at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:58)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:888)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:432)
Any suggestions on how to resolve this?
I tried adding -Djavax.xml.accessExternalSchema=all to the VM arguments while running locally, but the error remains the same.
<flow name="mytestingFlow" doc:id="4efe5074-da20-4164-843a-06ca9a2a9979" >
<http:listener doc:name="Listener" doc:id="74f4f199-00cb-460d-b72f-df3497f26e6a" config-ref="HTTP_Listener_config" path="/service/path/one"/>
<set-payload value="#["<MPItemFeed xmlns=\"http://walmart.com/\"><MPItemFeedHeader><version>3.2</version></MPItemFeedHeader><MPItem><processMode>CREATE</processMode><sku>10145802</sku><productIdentifiers><productIdentifier><productIdType>UPC</productIdType><productId>123456789123</productId></productIdentifier></productIdentifiers><MPProduct><SkuUpdate>NO</SkuUpdate><msrp>183.99</msrp><productName>CARQUEST Platinum Professional Ceramic Brake Pads - Front (4-Pad Set)</productName><ProductIdUpdate>YES</ProductIdUpdate><category><Vehicle><VehiclePartsAndAccessories><shortDescription>Ceramic Brake Pads - Front (4-Pad Set)</shortDescription><keyFeatures><keyFeaturesValue>Premium brake pad underlayer reduces vibration for silent braking Industry leading number of application specific formulations for maximum performance Revolutionary burnishing compound strip allows for proper break-in of pads and rotors.</keyFeaturesValue></keyFeatures><brand>CARQUEST Platinum Professional</brand><manufacturer>CARQUEST Platinum Professional</manufacturer><manufacturerPartNumber>PXD1210H</manufacturerPartNumber><mainImageUrl>http://pdfifsvcprd.corp.advancestores.com/assets/epc50x50/std.lang.all/1012147531.jpg</mainImageUrl><isProp65WarningRequired>Yes</isProp65WarningRequired><prop65WarningText>cancer and reproductive</prop65WarningText><hasWarranty>YES</hasWarranty><warrantyText>LIMITED LIFETIME REPLACEMENT</warrantyText></VehiclePartsAndAccessories></Vehicle></category></MPProduct><MPOffer><price>182.99</price><ShippingWeight><measure>4</measure><unit>lb</unit></ShippingWeight><ProductTaxCode>2038710</ProductTaxCode></MPOffer></MPItem><MPItem><processMode>CREATE</processMode><sku>11395545</sku><productIdentifiers><productIdentifier><productIdType>UPC</productIdType><productId>123456789123</productId></productIdentifier></productIdentifiers><MPProduct><SkuUpdate>NO</SkuUpdate><msrp>183.99</msrp><productName>CARQUEST Platinum Brake Rotor - Front</productName><ProductIdUpdate>YES</ProductIdUpdate><category><Vehicle><VehiclePartsAndAccessories><shortDescription>Brake Rotor - Front</shortDescription><keyFeatures><keyFeaturesValue>Engineered to withstand 120 hours of salt spray testing Manufactured to exacting quality and dimensional specifications for Superior Stopping Power Exceeds ISO manufacturing guidelines (International Organization for Standardization)</keyFeaturesValue></keyFeatures><brand>CARQUEST Platinum</brand><manufacturer>CARQUEST Platinum</manufacturer><manufacturerPartNumber>YH145232P</manufacturerPartNumber><mainImageUrl>http://pdfifsvcprd.corp.advancestores.com/assets/epc50x50/std.lang.all/1017931756.jpg</mainImageUrl><isProp65WarningRequired>No</isProp65WarningRequired><prop65WarningText/><hasWarranty>YES</hasWarranty><warrantyText>2 YR REPLACEMENT IF DEFECTIVE</warrantyText></VehiclePartsAndAccessories></Vehicle></category></MPProduct><MPOffer><price>182.99</price><ShippingWeight><measure>4</measure><unit>lb</unit></ShippingWeight><ProductTaxCode>2038710</ProductTaxCode></MPOffer></MPItem><MPItem><processMode>CREATE</processMode><sku>10556036</sku><productIdentifiers><productIdentifier><productIdType>UPC</productIdType><productId>123456789123</productId></productIdentifier></productIdentifiers><MPProduct><SkuUpdate>NO</SkuUpdate><msrp>183.99</msrp><productName>CARQUEST Premium Lube Element with Lid</productName><ProductIdUpdate>YES</ProductIdUpdate><category><Vehicle><VehiclePartsAndAccessories><shortDescription>Lube Element with Lid</shortDescription><keyFeatures><keyFeaturesValue>Environmental 
cartridge lube filter High efficiency and durable cellulose/synthetic blended media for longer drain intervals Silicone anti-drain back valve has 3X the durability verses nitrile for engine start-up protection</keyFeaturesValue></keyFeatures><brand>CARQUEST Premium</brand><manufacturer>CARQUEST Premium</manufacturer><manufacturerPartNumber>84312</manufacturerPartNumber><mainImageUrl>http://pdfifsvcprd.corp.advancestores.com/assets/epc50x50/std.lang.all/1015772990.jpg</mainImageUrl><isProp65WarningRequired>Yes</isProp65WarningRequired><prop65WarningText>cancer and reproductive</prop65WarningText><hasWarranty>YES</hasWarranty><warrantyText>REPLACE OR REFUND AT MGR DISCRETION</warrantyText></VehiclePartsAndAccessories></Vehicle></category></MPProduct><MPOffer><price>182.99</price><ShippingWeight><measure>4</measure><unit>lb</unit></ShippingWeight><ProductTaxCode>2038710</ProductTaxCode></MPOffer></MPItem><MPItem><processMode>CREATE</processMode><sku>20471798</sku><productIdentifiers><productIdentifier><productIdType>UPC</productIdType><productId>123456789123</productId></productIdentifier></productIdentifiers><MPProduct><SkuUpdate>NO</SkuUpdate><msrp>183.99</msrp><productName>Denso Air-Fuel Ratio Sensor 4 Wire, Direct Fit, Heated, Wire Length: 10.63</productName><ProductIdUpdate>YES</ProductIdUpdate><category><Vehicle><VehiclePartsAndAccessories><shortDescription>Air-Fuel Ratio Sensor 4 Wire, Direct Fit, Heated, Wire Length: 10.63</shortDescription><keyFeatures><keyFeaturesValue>Specifically designed to meet the increasing demands of today's engines 100% checked for high temperature signal output, air tightness, continuity, and heat resistance for optimal efficiency and performance Double protection cover helps maintain proper unit temperature for quicker response times, which is critical to your vehicle's fuel efficiency</keyFeaturesValue></keyFeatures><brand>Denso</brand><manufacturer>Denso</manufacturer><manufacturerPartNumber>234-9001</manufacturerPartNumber><mainImageUrl>http://pdfifsvcprd.corp.advancestores.com/assets/epc50x50/std.lang.all/524891.jpg</mainImageUrl><isProp65WarningRequired>No</isProp65WarningRequired><prop65WarningText/><hasWarranty>YES</hasWarranty><warrantyText>1 YR REPLACEMENT IF DEFECTIVE</warrantyText></VehiclePartsAndAccessories></Vehicle></category></MPProduct><MPOffer><price>182.99</price><ShippingWeight><measure>4</measure><unit>lb</unit></ShippingWeight><ProductTaxCode>2038710</ProductTaxCode></MPOffer></MPItem></MPItemFeed>
"]" doc:name="Set Payload" doc:id="b16a5f07-fa58-4c36-837a-3533eecdcccd" mimeType="application/xml"/>
<ee:transform doc:name="Transform Message" doc:id="a6d22ddb-1e3a-4519-a22f-41987f9b5049" >
<ee:message >
<ee:set-payload ><![CDATA[%dw 2.0
output application/xml
---
payload]]></ee:set-payload>
</ee:message>
</ee:transform>
<xml-module:validate-schema doc:name="Validate schema" doc:id="c5dbaef9-d4f3-4aeb-b15a-516aa0eb2479" schemas="Animal.xsd,ArtAndCraftCategory.xsd,Baby.xsd,CarriersAndAccessoriesCategory.xsd,ClothingCategory.xsd,Electronics.xsd,FoodAndBeverageCategory.xsd,FootwearCategory.xsd,FurnitureCategory.xsd,GardenAndPatioCategory.xsd,HealthAndBeauty.xsd,Home.xsd,JewelryCategory.xsd,Media.xsd,MPCommons.xsd,MPItem.xsd,MPItemFeed.xsd,MPItemFeedHeader.xsd,MPOffer.xsd,MPProduct.xsd,MusicalInstrument.xsd,OccasionAndSeasonal.xsd,OfficeCategory.xsd,OtherCategory.xsd,Photography.xsd,SportAndRecreation.xsd,ToolsAndHardware.xsd,ToysCategory.xsd,Vehicle.xsd,WatchesCategory.xsd" config-ref="XML_Config"/>
<logger level="INFO" doc:name="Logger" doc:id="a74dbf49-d111-4eb3-84c3-598845ecaf48" />
</flow>

In which location are the XSD files present? The path to the XSDs needs to be specified as mentioned here.
Reading the file is the problem. A similar issue comes up when reading MUnit input files; in those cases the files can be read with getResourceAsString or getResourceAsStream.
This should be helpful.
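One approach worth trying is to keep all the XSDs on the application classpath (under src/main/resources) and reference only the top-level schema, letting its nested imports resolve relatively. A minimal sketch, assuming the files are copied into a schemas folder (the folder name is hypothetical) and that MPItemFeed.xsd imports the other schemas with relative schemaLocation values:
<!-- Sketch: reference only the root schema and let its relative
     xs:import/xs:include statements pull in the rest. -->
<xml-module:validate-schema doc:name="Validate schema"
    schemas="schemas/MPItemFeed.xsd"
    config-ref="XML_Config"/>
If the application is later deployed to a standalone Mule runtime, the -Djavax.xml.accessExternalSchema=all argument would typically go into $MULE_HOME/conf/wrapper.conf as a wrapper.java.additional.<n> entry rather than on the command line.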

Related

Anypoint MQ messages remains in in-flight and not been processed

I have a flow which submits around 10-20 Salesforce bulk query job details to Anypoint MQ to be processed asynchronously.
I am using a normal queue (not a FIFO queue) and want to process one message at a time.
My subscriber configuration is given below. I set this whopping acknowledgement timeout of 15 minutes because, at most, it has taken 15 minutes for a job to change status from jobUpload to JobCompleted.
MuleRuntime: 4.4
MQ Connector Version: 3.2.0
<anypoint-mq:subscriber doc:name="Subscribering Bulk Query Job Details"
config-ref="Anypoint_MQ_Config"
destination="${anyPointMq.name}"
acknowledgementTimeout="15"
acknowledgementTimeoutUnit="MINUTES">
<anypoint-mq:subscriber-type >
<anypoint-mq:prefetch maxLocalMessages="1" />
</anypoint-mq:subscriber-type>
</anypoint-mq:subscriber>
Anypoint MQ Connector Configuration
<anypoint-mq:config name="Anypoint_MQ_Config" doc:name="Anypoint MQ Config" doc:id="ce3aaed9-dcba-41bc-8c68-037c5b1420e2">
<anypoint-mq:connection clientId="${secure::anyPointMq.clientId}" clientSecret="${secure::anyPointMq.clientSecret}" url="${anyPointMq.url}">
<reconnection>
<reconnect frequency="3000" count="3" />
</reconnection>
<anypoint-mq:tcp-client-socket-properties connectionTimeout="30000" />
</anypoint-mq:connection>
</anypoint-mq:config>
Subscriber flow
<flow name="sfdc-bulk-query-job-subscription" doc:id="7e1e23d0-d7f1-45ed-a609-0fb35dd23e6a" maxConcurrency="1">
<anypoint-mq:subscriber doc:name="Subscribering Bulk Query Job Details" doc:id="98b8b25e-3141-4bd7-a9ab-86548902196a" config-ref="Anypoint_MQ_Config" destination="${anyPointMq.sfPartnerEds.name}" acknowledgementTimeout="${anyPointMq.ackTimeout}" acknowledgementTimeoutUnit="MINUTES">
<anypoint-mq:subscriber-type >
<anypoint-mq:prefetch maxLocalMessages="${anyPointMq.prefecth.maxLocalMsg}" />
</anypoint-mq:subscriber-type>
</anypoint-mq:subscriber>
<json-logger:logger doc:name="INFO - Bulk Job Details have been fetched" doc:id="b25c3850-8185-42be-a293-659ebff546d7" config-ref="JSON_Logger_Config" message='#["Bulk Job Details have been fetched for " ++ payload.object default ""]'>
<json-logger:content ><![CDATA[#[output application/json ---
payload]]]></json-logger:content>
</json-logger:logger>
<set-variable value="#[p('serviceName.sfdcToEds')]" doc:name="ServiceName" doc:id="f1ece944-0ed8-4c0e-94f2-3152956a2736" variableName="ServiceName"/>
<set-variable value="#[payload.object]" doc:name="sfObject" doc:id="2857c8d9-fe8d-46fa-8774-0eed91e3a3a6" variableName="sfObject" />
<set-variable value="#[message.attributes.properties.key]" doc:name="key" doc:id="57028932-04ab-44c0-bd15-befc850946ec" variableName="key" />
<flow-ref doc:name="bulk-job-status-check" doc:id="c6b9cd40-4674-47b8-afaa-0f789ccff657" name="bulk-job-status-check" />
<json-logger:logger doc:name="INFO - subscribed bulk job id has been processed successfully" doc:id="7e469f92-2aff-4bf4-84d0-76577d44479a" config-ref="JSON_Logger_Config" message='#["subscribed bulk job id has been processed successfully for salesforce " ++ vars.sfObject default "" ++ " object"]' tracePoint="END"/>
</flow>
After the bulk query job subscriber, I check the status of the job 5 times, at 1-minute intervals, inside an Until Successful scope. It generally exhausts all 5 attempts, the message is subscribed again, and the same process repeats until the job completes. I have seen the Until Successful scope get exhausted more than once for a single job.
Once the job's status changes to jobComplete, I fetch the result and send it to an AWS S3 bucket via a MuleSoft system API. Here I also use retry logic because, due to the large volume of data, I always get this message on the first call:
HTTP POST on resource 'https://****//dlb.lb.anypointdns.net:443/api/sys/aws/s3/databricks/object' failed: Remotely closed.
But on the second retry I get a successful response from the S3 bucket system API.
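For context, a minimal sketch of the status-check retry described above, using the Until Successful scope with the 5 attempts and 1-minute interval mentioned (the flow-ref name is reused from the subscriber flow for illustration; whatever it wraps is assumed to raise an error while the job is not yet complete, so the scope retries):
<until-successful maxRetries="5" millisBetweenRetries="60000">
    <!-- assumed to raise an error while the job status is not yet jobComplete -->
    <flow-ref name="bulk-job-status-check" doc:name="bulk-job-status-check"/>
</until-successful>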
Now the main problem: even though I am using a normal queue, I have noticed that messages remain in flight for an indefinite amount of time and never get picked up by the Mule flow/subscriber. The screenshot below shows an example: there were 7 messages in flight that were not picked up even after many days.
I have set maxConcurrency and the prefetch maxLocalMessages to 1, yet more than 1 message is being taken out of the queue. Please help me understand this.

Issues Migrating Mule 3.6 expression component into DataWeave 2.0

My company is moving toward migrating our current Mule 3.6 APIs to Mule 4.2, and I'm trying to migrate our first API at present. There are numerous differences between the runtimes, not least the wide use of DataWeave 2.0 in Mule 4. I'm new to a lot of the components of Mule 4, having used Mule 3 extensively, and I'm currently stuck on moving the following expression component into DataWeave; I'm struggling to get the correct syntax without Studio complaining that there are errors.
The current expression is
<expression-component doc:name="Expression"><![CDATA[
flowVars.pSector="ELECTRICITY";
if(flowVars.serialNo.length()==14){
if (flowVars.serialNo.substring(0,2)=="G4" || flowVars.serialNo.substring(0,2)=="E6" ||
flowVars.serialNo.substring(0,2)=="JE" || flowVars.serialNo.substring(0,2)=="JA" ||
flowVars.serialNo.substring(0,2)=="JS") {
flowVars.pSector="GAS";
}
}]]></expression-component>
This essentially determines the fuel type of a meter based on parts of its serial number and its length. Any help converting this expression into DataWeave would be appreciated.
Note that in Mule 4 scripts cannot have side effects, meaning that you can only assign the result of a script to the payload or to a single variable. Also, DataWeave is a functional language rather than an imperative one.
In Mule 4 variables are referenced as vars.name instead of flowVars.name.
A naive translation could be like this:
<ee:transform doc:name="Transform Message">
<ee:message >
<ee:set-payload ><![CDATA[%dw 2.0
output application/java
fun checkSerial(serial)=if (sizeOf(serial) == 14 )
if (serial[0 to 1] == "G4" or serial[0 to 1]=="E6" or serial[0 to 1]=="JE"
or serial[0 to 1]=="JA" or serial[0 to 1]=="JS")
"GAS"
else
"ELECTRICITY"
else "ELECTRICITY"
---
checkSerial(vars.serialNo)]]></ee:set-payload>
</ee:message>
</ee:transform>
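If, as in the Mule 3 expression, the result needs to end up in a variable named pSector rather than in the payload, the same script can target a variable instead. A sketch reusing the function above (only the wrapper around the script changes):
<ee:transform doc:name="Set pSector">
    <ee:variables>
        <ee:set-variable variableName="pSector"><![CDATA[%dw 2.0
output application/java
fun checkSerial(serial) = if (sizeOf(serial) == 14 and
    (serial[0 to 1] == "G4" or serial[0 to 1] == "E6" or serial[0 to 1] == "JE"
     or serial[0 to 1] == "JA" or serial[0 to 1] == "JS"))
    "GAS"
else
    "ELECTRICITY"
---
checkSerial(vars.serialNo)]]></ee:set-variable>
    </ee:variables>
</ee:transform>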

Mule Batch Max Failures not working

I'm using a Mule batch flow to process files. As per the requirement, I should stop the batch step from processing further records after 10 failures.
So I've configured max-failed-records="10", but I still see around 99 failures in the logger that I keep in the On Complete phase. The file the app receives has around 8657 rows, so the loaded records will be 8657.
Logger in complete phase:
<logger message="#['Failed Records'+payload.failedRecords]" level="INFO" doc:name="Logger"/>
The image below shows my flow:
This is the default behavior of Mule. As per the batch documentation, Mule loads 1600 records at once (16 threads x 100 records per block). Even though max failures is set to 10, Mule will still process all of the loaded records; it just won't load the next record blocks once the max failure limit is reached.
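For illustration, a hypothetical way to make the limit take effect sooner is to reduce the block size, so that fewer records are already loaded when the 10th failure occurs. This sketch assumes a Mule 3.8+ runtime where the block size is configurable on the batch job; treat the block-size attribute name as an assumption to verify against your runtime's schema:
<!-- Hypothetical sketch: smaller blocks mean fewer records are in flight
     when max-failed-records is reached, so the job stops sooner. -->
<batch:job name="fileProcessingBatch" max-failed-records="10" block-size="10">
    <batch:process-records>
        <batch:step name="ProcessRecordStep">
            <!-- per-record processing goes here -->
            <logger message="#['Processing record: ' + payload]" level="INFO" doc:name="Logger"/>
        </batch:step>
    </batch:process-records>
</batch:job>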
Hope this helps.

SAP Pool capacity and peak limit-Mule esb

Hi, I have an SAP connector and SAP is sending 10,000 IDocs in parallel. What would be good numbers for the pool capacity and peak limit? I have currently set 2000 and 1000 respectively. Any suggestions for better performance?
<sap:connector name="SAP" jcoAsHost="${saphost}" jcoUser="${sapuser}" jcoPasswd="${sappassword}" jcoSysnr="${sapsystemnumber}" jcoClient="${sapclient}" jcoLang="${saploginlanguage}" validateConnections="true" doc:name="SAP" jcoPeakLimit="1000" jcoPoolCapacity="2000"/>
When dealing with large amounts of IDocs, the first thing I recommend to improve performance is to configure the SAP system to send IDocs in batches instead of sending them individually. That is, the IDocs are sent in groups of X, the number you define as the batch size in the Partner Profile section of SAPGUI. Among other settings, you have to set "Pack. Size" and select "Collect IDocs" as the Output Mode.
If you are not familiar with SAPGUI, ask your SAP admin to configure it for you.
Additionally, to get the most out of the connection pooling in the SAP connector, I suggest you use SAP Client Extended Properties to take full advantage of the additional JCo connection parameters. These extended properties are defined in a Spring bean and set in jcoClientExtendedProperties at the connector or endpoint level. Take a look at the following example:
<spring:beans>
<spring:bean name="sapClientProperties" class="java.util.HashMap">
<spring:constructor-arg>
<spring:map>
<!-- Maximum number of active connections that can be created for a destination simultaneously -->
<spring:entry key="jco.destination.peak_limit" value="15"/>
<!-- Maximum number of idle connections kept open by the destination. A value of 0 has the effect that there is no connection pooling, i.e. connections will be closed after each request. -->
<spring:entry key="jco.destination.pool_capacity" value="10"/>
<!-- Time in ms after that the connections hold by the internal pool can be closed -->
<spring:entry key="jco.destination.expiration_time" value="300000"/>
<!-- Interval in ms with which the timeout checker thread checks the connections in the pool for expiration -->
<spring:entry key="jco.destination.expiration_check_period" value="60000"/>
<!-- Max time in ms to wait for a connection, if the max allowed number of connections is allocated by the application -->
<spring:entry key="jco.destination.max_get_client_time" value="30000"/>
</spring:map>
</spring:constructor-arg>
</spring:bean>
</spring:beans>
<sap:connector name="SAP"
jcoAsHost="${sap.jcoAsHost}"
jcoUser="${sap.jcoUser}"
jcoPasswd="${sap.jcoPasswd}"
jcoSysnr="${sap.jcoSysnr}"
jcoClient="${sap.jcoClient}"
...
jcoClientExtendedProperties-ref="sapClientProperties" />
Important: to enable the pool, it is mandatory that you set jcoExpirationTime in addition to the Peak Limit and Pool Capacity.
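Applied to the connector from the question, that would look something like this sketch (300000 ms is just an illustrative value, not a sizing recommendation):
<sap:connector name="SAP"
    jcoAsHost="${saphost}"
    jcoUser="${sapuser}"
    jcoPasswd="${sappassword}"
    jcoSysnr="${sapsystemnumber}"
    jcoClient="${sapclient}"
    jcoLang="${saploginlanguage}"
    validateConnections="true"
    jcoPoolCapacity="2000"
    jcoPeakLimit="1000"
    jcoExpirationTime="300000"
    doc:name="SAP"/>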
For further details, please refer to SAP Connector Advanced Features documentation.

Mule batch commit and records failures

My current scenario:
I have 10000 records as input to batch.
As per my understanding, batch is only for record-by-record processing. Hence, I am transforming each record using a DataWeave component inside the batch step (note: I have not used any batch commit) and writing each record to a file. The reason for record-by-record processing is that if any particular record contains invalid data, only that record fails; the rest are processed fine.
But in many of the blogs I see, a batch commit (with streaming) is used with the DataWeave component. As per my understanding, all the records are then given to DataWeave in one shot, and if one record has invalid data, all 10000 records fail (at DataWeave). Then the point of record-by-record processing is lost.
Is the above assumption correct, or am I thinking about it the wrong way?
That is the reason I am not using batch commit.
Now, as I said, I am sending each record to a file. Actually, I have a requirement to send each record to 5 different CSV files. So currently I am using a Scatter-Gather component inside my batch step to send it to five different routes.
As you can see in the image, the input phase gives a collection of 10000 records, and each record is sent to 5 routes using Scatter-Gather.
Is the approach I am using fine, or is there a better design I can follow?
Also, I have created a 2nd batch step to capture ONLY FAILED RECORDS. But with the current design, I am not able to capture the failed records.
SHORT ANSWERS
Is the above assumption correct, or am I thinking about it the wrong way?
In short, yes, you are thinking about it the wrong way. Read my long explanation with an example to understand why; I hope you will appreciate it.
Also, I have created a 2nd batch step to capture ONLY FAILED RECORDS.
But with the current design, I am not able to capture the failed records.
You probably forgot to set max-failed-records="-1" (unlimited) on the batch job. The default is 0, so on the first failed record the batch returns and does not execute the subsequent steps.
Is the approach I am using fine, or is there a better design I can follow?
I think it makes sense if performance is essential for you and you cannot cope with the overhead created by doing this operation in sequence.
If instead you can slow down a bit, it could make sense to do this operation in 5 different steps, as sketched below: you will lose parallelism, but you gain better control over failing records, especially if using batch commit.
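A rough sketch of what two of those sequential steps could look like, with placeholder loggers standing in for the actual CSV writing (step names and commit size are hypothetical):
<batch:step name="WriteCsv1Step">
    <batch:commit size="100" doc:name="Batch Commit">
        <!-- transform the committed records and write them to the first CSV here -->
        <logger message="#['Committing ' + payload.size() + ' records for CSV 1']" level="INFO" doc:name="Logger"/>
    </batch:commit>
</batch:step>
<batch:step name="WriteCsv2Step">
    <batch:commit size="100" doc:name="Batch Commit">
        <!-- same pattern for the second CSV, and so on for the remaining three -->
        <logger message="#['Committing ' + payload.size() + ' records for CSV 2']" level="INFO" doc:name="Logger"/>
    </batch:commit>
</batch:step>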
MULE BATCH JOB IN PRACTICE
I think the best way to explain how it works is through an example.
Consider the following case:
You have a batch process configured with max-failed-records="-1" (no limit).
<batch:job name="batch_testBatch" max-failed-records="-1">
In this process we input a collection composed of 6 strings.
<batch:input>
<set-payload value="#[['record1','record2','record3','record4','record5','record6']]" doc:name="Set Payload"/>
</batch:input>
The processing is composed of 3 steps:
The first step just logs the record being processed; the second step also logs the record and throws an exception on record3 to simulate a failure.
<batch:step name="Batch_Step">
<logger message="-- processing #[payload] in step 1 --" level="INFO" doc:name="Logger"/>
</batch:step>
<batch:step name="Batch_Step2">
<logger message="-- processing #[payload] in step 2 --" level="INFO" doc:name="Logger"/>
<scripting:transformer doc:name="Groovy">
<scripting:script engine="Groovy"><![CDATA[
if(payload=="record3"){
throw new java.lang.Exception();
}
payload;
]]>
</scripting:script>
</scripting:transformer>
</batch:step>
The third step instead contains just the commit, with a commit size of 2.
<batch:step name="Batch_Step3">
<batch:commit size="2" doc:name="Batch Commit">
<logger message="-- committing #[payload] --" level="INFO" doc:name="Logger"/>
</batch:commit>
</batch:step>
Now you can follow me through the execution of this batch process:
On start, all 6 records are processed by the first step, and the console log looks like this:
-- processing record1 in step 1 --
-- processing record2 in step 1 --
-- processing record3 in step 1 --
-- processing record4 in step 1 --
-- processing record5 in step 1 --
-- processing record6 in step 1 --
Step Batch_Step finished processing all records for instance d8660590-ca74-11e5-ab57-6cd020524153 of job batch_testBatch
Now things get more interesting in step 2: record3 will fail because we explicitly throw an exception, but despite this the step continues processing the other records. Here is how the log looks:
-- processing record1 in step 2 --
-- processing record2 in step 2 --
-- processing record3 in step 2 --
com.mulesoft.module.batch.DefaultBatchStep: Found exception processing record on step ...
Stacktrace
....
-- processing record4 in step 2 --
-- processing record5 in step 2 --
-- processing record6 in step 2 --
Step Batch_Step2 finished processing all records for instance d8660590-ca74-11e5-ab57-6cd020524153 of job batch_testBatch
At this point, despite a failed record in this step, the batch processing continues because the parameter max-failed-records is set to -1 (unlimited) and not to the default value of 0.
All the successful records are then passed to step 3. This is because, by default, the accept-policy parameter of a step is set to NO_FAILURES (the other possible values are ALL and ONLY_FAILURES).
Now step 3, which contains the commit phase with a size equal to 2, commits the records two by two:
-- committing [record1, record2] --
-- committing [record4, record5] --
Step: Step Batch_Step3 finished processing all records for instance d8660590-ca74-11e5-ab57-6cd020524153 of job batch_testBatch
-- committing [record6] --
As you can see, this confirms that record3, which was in failure, was not passed to the next step and therefore not committed.
Starting from this example, I think you can imagine and test more complex scenarios. For example, after the commit you could have another step that processes only the failed records, to make an administrator aware of the failures via email.
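A minimal sketch of such a step, using the accept-policy mentioned earlier (the step name is hypothetical, and a logger stands in for the mail notification):
<batch:step name="NotifyFailedRecordsStep" accept-policy="ONLY_FAILURES">
    <logger message="#['Record failed: ' + payload]" level="ERROR" doc:name="Logger"/>
</batch:step>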
You can also use external storage to store more advanced info about your records, as you can read in my answer to this other question.
Hope this helps