How to force file-queue-store to keep messages on disk - mule

For asynchronous processing of a large number of files, it would be nice to store messages in persistent storage to relieve the JVM heap and avoid data loss in case of a system failure.
I configured file-queue-store, but unfortunately I cannot see any .msg files in the .mule/queuestore/myqueuename folder.
I feed the flow with files from an smb: endpoint and send them to a CXF endpoint.
When I stop Mule ESB (version 3.2.0) properly during file processing, it writes a lot of .msg files to the queuestore. After a restart it processes them one by one.
But when I kill the JVM (to simulate a system failure, an OutOfMemoryError, etc.), there are no files in the queuestore, so all of the messages are lost.
My question: Is it possible to force the queuestore to store the messages on disk and delete them only when they are fully processed?
Please advise. Thanks in advance.

Mule 3.2.0 was affected by this issue.
You should consider upgrading.
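Independent of that fix, the guarantee the question asks for — persist each message to disk before it is queued and delete the file only after successful processing — can be illustrated outside Mule. This is not Mule's queue-store API, just a minimal Java sketch of the persist-then-process pattern; the class, directory layout and file naming are made up:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

// Minimal persist-then-process sketch (not Mule API): a message is written and synced
// to disk before it is handed to the processor, and the file is removed only after
// processing succeeds, so a JVM crash leaves the unprocessed .msg files behind.
public class DurableQueueSketch {

    private final Path queueDir;

    public DurableQueueSketch(Path queueDir) throws IOException {
        this.queueDir = Files.createDirectories(queueDir);
    }

    // Persist the payload first; only then is it eligible for processing.
    public Path enqueue(byte[] payload) throws IOException {
        Path msg = queueDir.resolve(UUID.randomUUID() + ".msg");
        Files.write(msg, payload, StandardOpenOption.CREATE_NEW, StandardOpenOption.SYNC);
        return msg;
    }

    // Delete the file only after the processor completed without throwing.
    public void process(Path msg, MessageProcessor processor) throws IOException {
        processor.process(Files.readAllBytes(msg));
        Files.delete(msg);
    }

    public interface MessageProcessor {
        void process(byte[] payload) throws IOException;
    }
}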

Related

Mule 4 Runtime - Clear Application Data of a batch process

I was having an intermittent issue running a Mule batch with a huge amount of data in Anypoint Studio. That issue was resolved by enabling the 'Always' option under 'Clear Application Data' in 'Run Configurations' (as per the instructions in Mule ESB - Clear Memory of a batch process).
How can I enable the same 'Always' option in a standalone Mule Runtime at startup, i.e. when we are not running the batch from Anypoint Studio? Is there any command-line argument that can be used in the Mule Runtime startup script to achieve the same goal?
By deleting the local data you are deleting batch queues, persistent object stores and maybe some other information. In a development environment like the Anypoint Studio IDE that is usually OK, but for a standalone Mule Runtime it means you are deleting production data, for example records that are used by batch to continue processing after a restart. That data will be lost. Having said that, it might be needed if the data is completely corrupted.
It is best practice, and my strong advice to any user, to resolve the root cause of the issue rather than delete data. It should never be done every time you start your production Mule, only when there is absolutely no other alternative.
I don't recommend deleting local files at all. If, even after my warnings, you absolutely need to do this, never ever delete the whole .mule directory. If you still want to risk losing data, delete only the directory with the application's name under the .mule directory, as sketched below.
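If you do go ahead on a standalone runtime (with Mule stopped), the operation amounts to removing just that one application's folder under .mule. A minimal Java sketch, where the runtime path and application name are placeholders:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

public class ClearAppData {
    public static void main(String[] args) throws IOException {
        // Placeholders: adjust to your runtime location and application name.
        Path appData = Paths.get("/opt/mule/.mule", "my-batch-app");

        // Delete only this application's folder, never the whole .mule directory.
        if (Files.exists(appData)) {
            try (Stream<Path> paths = Files.walk(appData)) {
                paths.sorted(Comparator.reverseOrder()) // children before parents
                     .forEach(p -> p.toFile().delete());
            }
        }
    }
}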

Unable to execute HTTP request: Timeout waiting for connection from pool in Flink

I'm working on an app which uploads some files to an S3 bucket and, at a later point, reads those files from the S3 bucket and pushes them to my database.
I'm using Flink 1.4.2 and the fs.s3a API for reading and writing files from the S3 bucket.
Uploading files to the S3 bucket works fine without any problem, but when the second phase of my app starts, reading those uploaded files from S3, my app throws the following error:
Caused by: java.io.InterruptedIOException: Reopen at position 0 on s3a://myfilepath/a/b/d/4: org.apache.flink.fs.s3hadoop.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:125)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:155)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:281)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:364)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.flink.runtime.fs.hdfs.HadoopDataInputStream.read(HadoopDataInputStream.java:94)
at org.apache.flink.api.common.io.DelimitedInputFormat.fillBuffer(DelimitedInputFormat.java:702)
at org.apache.flink.api.common.io.DelimitedInputFormat.open(DelimitedInputFormat.java:490)
at org.apache.flink.api.common.io.GenericCsvInputFormat.open(GenericCsvInputFormat.java:301)
at org.apache.flink.api.java.io.CsvInputFormat.open(CsvInputFormat.java:53)
at org.apache.flink.api.java.io.PojoCsvInputFormat.open(PojoCsvInputFormat.java:160)
at org.apache.flink.api.java.io.PojoCsvInputFormat.open(PojoCsvInputFormat.java:37)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:145)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
I was able to control this error by increasing the max connections parameter for the s3a API.
As of now, I have around 1000 files in the S3 bucket that are pushed and pulled by my app, and my max connections setting is 3000. I'm using Flink's parallelism to upload/download these files from the S3 bucket. My task manager count is 14.
This is an intermittent failure; I also have successful runs for this scenario.
My questions are:
Why am I getting an intermittent failure? If the max connections value I set were too low, then my app should throw this error every time I run.
Is there any way to calculate the optimal max connections value required for my app to work without hitting the connection pool timeout? Or is this error related to something else that I'm not aware of?
Thanks in advance.
Some comments, based on my experience with processing lots of files from S3 via Flink (batch) workflows:
When you are reading the files, Flink will calculate "splits" based on the number of files, and each file's size. Each split is read separately, so the theoretical max # of simultaneous connections isn't based on the # of files, but a combination of files and file sizes.
The connection pool used by the HTTP client releases connections after some amount of time, as being able to reuse an existing connection is a win (server/client handshake doesn't have to happen). So that introduces a degree of randomness into how many available connections are in the pool.
The size of the connection pool doesn't impact memory much, so I typically set it pretty high (e.g. 4096 for a recent workflow).
When using AWS connection code, the setting to bump is fs.s3.maxConnections, which isn't the same as the pure Hadoop configuration key (see the sketch below).
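For reference, on a plain Hadoop s3a setup the pool size is the fs.s3a.connection.maximum key. Below is a hedged Java sketch of setting both keys programmatically on a Hadoop Configuration; with Flink's shaded flink-s3-fs-hadoop the same values are usually supplied through the Flink/Hadoop configuration files instead, and the bucket and key here are placeholders:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class S3PoolConfigSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Hadoop s3a connector: maximum number of pooled HTTP connections.
        conf.setInt("fs.s3a.connection.maximum", 4096);
        // AWS/EMRFS-style connector uses a differently named key for the same limit.
        conf.setInt("fs.s3.maxConnections", 4096);

        // Placeholder bucket and key, just to exercise the connection pool.
        try (FileSystem fs = FileSystem.get(new URI("s3a://my-bucket/"), conf)) {
            System.out.println(fs.exists(new Path("/some/key")));
        }
    }
}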

Weblogic 10.3.6 generates empty heapdump on OutOfMemoryError

I'm trying to generate a full heapdump from Weblogic 10.3.6 due to an OutOfMemoryError generated by a Web Application deployed on the Server.
I've set the following JVM options in the start script:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/heapdump
When the OutOfMemoryError occurs, WebLogic generates an empty hprof file (0 bytes) in the /path/to/heapdump folder, and nothing else happens: the server remains in RUNNING mode, even though it is not reachable anymore.
The Java process is still alive, but at 0% CPU.
Even the server.out log seems completely frozen, without any trace of the OutOfMemoryError.
What's wrong with the configuration?
You can probably use Java Flight Recorder to record events and check which objects are generating the OOM (any profiler should work as well).
Been there :( . I remember at the time we found it somewhat logical: since there was not enough memory for normal operation, the JVM could not automagically find enough memory to create a heap dump either. If memory serves me well, at that time we did two things to debug the memory leak. First, we were "lucky" enough that the problem happened fairly regularly, so close manual monitoring was possible (monitoring the gc.log for repeated full GCs, and monitoring the performance tab in the console). Knowing when the onset of the problem was, we did some kill -3 to get the dump manually. We also used jstack {PID} (JDK 1.6 on Linux) with some luck. With those, at the time, the devs were able to identify the memory leak. Hope that helps.
Okay, your configuration looks alright. You might want to check whether the WebLogic process user has the rights to write the heap dump file. A quick way to verify this outside WebLogic is sketched below.
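Run a throwaway class as the same OS user, with the same -XX flags and a small heap; if the .hprof file is still empty here, the problem is the dump path or permissions rather than the application. A minimal sketch (the class name is made up):

import java.util.ArrayList;
import java.util.List;

// Run as the WebLogic OS user, for example:
//   java -Xmx128m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/heapdump OomTrigger
public class OomTrigger {
    public static void main(String[] args) {
        List<byte[]> hog = new ArrayList<>();
        while (true) {
            hog.add(new byte[8 * 1024 * 1024]); // keep allocating 8 MB blocks until the heap is exhausted
        }
    }
}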
You can take a heap dump with the JDK tools:
$JAVA_HOME/bin/jmap -dump:format=b,file=path_of_the_file <pid>
OR
%JROCKIT_HOME%\bin\jrcmd <pid> hprofdump filename=path_of_the_file

OOM Exception with MTOM client

I am working on transferring large files and finally ended up with an MTOM implementation. We created an MTOM-enabled web service and client, and tested the client as a plain Java program; we were able to send a 1 GB file successfully. The main point here is that the heap on the client side did not even grow beyond 70 MB.
But when I tried to initiate the same call from the WebLogic container (i.e. created a web client), we ended up with the OOM exception below:
at weblogic.utils.io.UnsyncByteArrayOutputStream.resizeBuffer(UnsyncByteArrayOutputStream.java:59)
at weblogic.utils.io.UnsyncByteArrayOutputStream.write(UnsyncByteArrayOutputStream.java:89)
at javax.activation.DataHandler.writeTo(DataHandler.java:293)
at com.sun.xml.ws.encoding.MtomCodec$ByteArrayBuffer.write(MtomCodec.java:196)
at com.sun.xml.ws.encoding.MtomCodec.encode(MtomCodec.java:163)
at com.sun.xml.ws.encoding.SOAPBindingCodec.encode(SOAPBindingCodec.java:258)
at com.sun.xml.ws.transport.http.client.HttpTransportPipe.process(HttpTransportPipe.java:142)
at com.sun.xml.ws.transport.http.client.HttpTransportPipe.processRequest(HttpTransportPipe.java:86)
at com.sun.xml.ws.api.pipe.Fiber.__doRun(Fiber.java:598)
at com.sun.xml.ws.api.pipe.Fiber._doRun(Fiber.java:557)
at com.sun.xml.ws.api.pipe.Fiber.doRun(Fiber.java:542)
at com.sun.xml.ws.api.pipe.Fiber.runSync(Fiber.java:439)
at com.sun.xml.ws.client.Stub.process(Stub.java:248)
at com.sun.xml.ws.client.sei.SEIStub.doProcess(SEIStub.java:135)
at com.sun.xml.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:109)
at com.sun.xml.ws.client.sei.SyncMethodHandler.invoke(SyncMethodHandler.java:89)
at com.sun.xml.ws.client.sei.SEIStub.invoke(SEIStub.java:118)
at $Proxy101.uploadFile(Unknown Source)
Does anyone have any idea?
UPDATE: it seems the MTOM settings are not effective when we run the program in the WebLogic container! But I am still not able to find the solution.
UPDATE 2: it seems WebLogic does not support streaming! I will upgrade the WebLogic version and update the ticket; until then, wish me luck.
Add this additional Java/JVM option in setDomainEnv.sh:
EXTRA_JAVA_PROPERTIES="-DUseSunHttpHandler=true ${EXTRA_JAVA_PROPERTIES}"
export EXTRA_JAVA_PROPERTIES
This switches from the WebLogic-specific handler (weblogic.net.http.HttpURLConnection) to Sun's HTTP handler.
This solved my issue.
Refer:
Changing HttpURLConnection in running jvm
http://atgtipsandtweaks.blogspot.com/2011/11/weblogicjava-httphandler-issues.html
Thanks!
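For completeness: on a JAX-WS RI client, the usual way to keep a large MTOM attachment off the heap is to combine the MTOMFeature with HTTP chunked streaming on the request context, which is presumably why the -DUseSunHttpHandler=true switch above made a difference. A hedged sketch, where FileUploadService, FileUploadPort and uploadFile are hypothetical placeholders for the generated client artifacts:

import java.util.Map;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.ws.BindingProvider;
import javax.xml.ws.soap.MTOMFeature;

public class MtomUploadClient {
    public static void main(String[] args) {
        // FileUploadService / FileUploadPort / uploadFile are hypothetical placeholders
        // for the wsimport-generated service, port and operation.
        FileUploadService service = new FileUploadService();
        FileUploadPort port = service.getFileUploadPort(new MTOMFeature());

        Map<String, Object> ctx = ((BindingProvider) port).getRequestContext();
        // JAX-WS RI property: stream the request with HTTP chunked transfer encoding
        // instead of buffering the whole MTOM attachment in a byte array first.
        ctx.put("com.sun.xml.ws.transport.http.client.streaming.chunk.size", 8192);

        DataHandler payload = new DataHandler(new FileDataSource("/path/to/large-file.bin"));
        port.uploadFile(payload);
    }
}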

mule sftp archive not continuing flow

I'm having major issues with Mule 3 and files that are read and should later be put on a standard queue on ActiveMQ.
Basically it's a really simple service that starts with an SFTP inbound endpoint.
The file is read correctly from the SFTP area, and the Mule log for the reading application states that the file is written to the specified archiveDir.
After this it's silent and nothing else happens... the file is just placed in the archiveDir, and neither ActiveMQ nor Mule 3 gives any indication that something has gone wrong.
The queue names etc. are all correct.
Basically the same environment is running on a second server with no problems.
Are there any commonly known issues that could make Mule not continue with its processing and put the file on the queue?
Thx in advance!