Apache Camel Read only First Line

I have a requirement wherein I have to push files to directories dynamically based on their contents.
The information about the target directory is available in the first line of the file.
As the files are very large, loading the entire contents of a file is not suitable.
I also want to skip the rest of the file once the first line is read. Following is the code that I have written:
from("file:D:\\camel\\input?recursive=true&delete=true")
.split().tokenize("'",1)
.process(new CustomProcessor())
.to("file:D:\\camel\\output\\${header.foldername}");
The issue with this approach is that Camel parses the entire file. Also, the output folder gets only the line that was tokenized rather than the entire file contents.
Please assist.

Thanks @claus and @Souciance for your feedback. Actually, I had another challenge: many of the files I received did not have '\n' or '\r' as a delimiter, so even reading a single line could mean reading the entire file. I implemented the solution using a Scanner with a custom delimiter, as follows.
My router is defined as follows:
@Override
public void configure() throws Exception {
    from("file:D:\\camel\\input?recursive=true&delete=true")
        .process(new CustomProcessor())
        .recipientList(header("foldername"));
}
The processor code is:
@Override
public void process(Exchange exchange) throws Exception {
    File data = exchange.getIn().getBody(File.class);
    Scanner sc = new Scanner(data, "UTF-8");
    sc.useDelimiter("'");
    String folderPath = "";
    while (sc.hasNext()) {
        String line = sc.next();
        // business logic: derive folderPath from the first token
        break;
    }
    sc.close();
    String destDir = "file:D:\\camel\\output\\" + folderPath;
    exchange.getIn().setHeader("foldername", destDir);
}
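For files that do happen to have a newline-terminated first line, the same idea can be expressed with a plain BufferedReader instead of a Scanner. This is only a rough sketch: the header name and output prefix mirror the code above, while the FirstLineProcessor class name and the trim() call are illustrative.
import java.io.BufferedReader;
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class FirstLineProcessor implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        File data = exchange.getIn().getBody(File.class);
        String folderPath = "";
        // Read only the first line; the rest of the file is never touched.
        try (BufferedReader reader = Files.newBufferedReader(data.toPath(), StandardCharsets.UTF_8)) {
            String firstLine = reader.readLine();
            if (firstLine != null) {
                folderPath = firstLine.trim(); // business logic would derive the folder name here
            }
        }
        exchange.getIn().setHeader("foldername", "file:D:\\camel\\output\\" + folderPath);
    }
}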

Related

How to write/serialize lucene's ByteBuffersDirectory to disk?

How would one write a Lucene 8.11 ByteBuffersDirectory to disk?
Something similar to Lucene 2.9.4's Directory.copy(directory, FSDirectory.open(indexPath), true).
You can use the copyFrom method to do this.
For example:
Suppose you are using a ByteBuffersDirectory:
final Directory dir = new ByteBuffersDirectory();
Assuming you are not concurrently writing any new data to that dir, you can declare a target where you want to write the data - for example, a FSDirectory (a file system directory):
Directory to = FSDirectory.open(Paths.get(OUT_DIR_PATH));
Use whatever string you want for the OUT_DIR_PATH location.
Then you can iterate over all the files in the original dir object, writing them to this new to location:
IOContext ctx = new IOContext();
for (String file : dir.listAll()) {
    System.out.println(file); // just for testing
    to.copyFrom(dir, file, file, ctx);
}
This will create the new OUT_DIR_PATH dir and populate it with files, such as:
_0.cfe
_0.cfs
_0.si
segments_1
... or whatever files you happen to have in your dir.
Caveat:
I have only used this with a default IOContext object. There are other constructors for the context; I'm not sure what they do, but I assume they give you more control over how the write is performed.
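For reference, the shared IOContext.DEFAULT constant can be used instead of constructing a context by hand. Below is a self-contained sketch of the whole flow; the indexed document and the "lucene-out" output path are made up.
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.IOContext;

public class CopyIndexToDisk {
    public static void main(String[] args) throws Exception {
        // Build a tiny in-memory index so there is something to copy.
        Directory dir = new ByteBuffersDirectory();
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new TextField("body", "hello lucene", Field.Store.YES));
            writer.addDocument(doc);
        }

        // Copy every file from the in-memory directory to a directory on disk.
        try (Directory to = FSDirectory.open(Paths.get("lucene-out"))) {
            for (String file : dir.listAll()) {
                to.copyFrom(dir, file, file, IOContext.DEFAULT);
            }
        }
        dir.close();
    }
}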
Meanwhile I figured it out by myself and created a straightforward method for it:
@SneakyThrows
public static void copyIndex(ByteBuffersDirectory ramDirectory, Path destination) {
    FSDirectory fsDirectory = FSDirectory.open(destination);
    Arrays.stream(ramDirectory.listAll())
            .forEach(fileName -> {
                try {
                    // IOContext is null because it is in fact not used (at least for the moment)
                    fsDirectory.copyFrom(ramDirectory, fileName, fileName, null);
                } catch (IOException e) {
                    log.error(e.getMessage(), e);
                }
            });
}

How to merge 10000 PDFs into one using PDFBox in the most effective way

The PDFBox API works fine for a small number of files, but I need to merge 10000 PDF files into one, and when I pass 10000 files (about 5 GB) it takes about 5 GB of RAM and finally goes out of memory.
Is there some implementation for such a requirement in PDFBox?
I tried to tune it: I used AutoClosedInputStream, which gets closed automatically after a read, but the output is still the same.
I have a similar scenario here, but I need to merge only 1000 documents into a single one.
I tried to use the PDFMergerUtility class, but I was getting an OutOfMemoryError. So I refactored my code to read each document, load its first page (my source documents have one page only), and then merge, instead of using PDFMergerUtility. Now it works fine, with no more OutOfMemoryError.
public void merge(final List<Path> sources, final Path target) {
    final int firstPage = 0;
    try (PDDocument doc = new PDDocument()) {
        for (final Path source : sources) {
            try (final PDDocument sdoc = PDDocument.load(source.toFile(), MemoryUsageSetting.setupTempFileOnly())) {
                final PDPage spage = sdoc.getPage(firstPage);
                doc.importPage(spage);
            }
        }
        doc.save(target.toAbsolutePath().toString());
    } catch (final IOException e) {
        throw new IllegalStateException(e);
    }
}
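Another option worth noting: PDFMergerUtility itself can be told to buffer to temporary files rather than main memory by passing a MemoryUsageSetting to mergeDocuments. The following is only a sketch, assuming PDFBox 2.x; the class name and paths are illustrative.
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

public class MergeWithTempFiles {
    public static void merge(final List<Path> sources, final Path target) throws IOException {
        final PDFMergerUtility merger = new PDFMergerUtility();
        for (final Path source : sources) {
            merger.addSource(source.toFile());
        }
        merger.setDestinationFileName(target.toAbsolutePath().toString());
        // Buffer intermediate data in temp files instead of holding everything in RAM.
        merger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
    }
}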

Is there an event listener in Eclipse core for "OnFileOpen"?

I am trying to write a plugin which parses the source code of any opened (Java) file.
All I have found so far is IResourceChangeListener, but what I need is a listener for some kind of "onResourceOpenedEvent".
Does something like that exist?
The nearest you can get to this is to use an IPartListener to listen to part events:
PlatformUI.getWorkbench().getActiveWorkbenchWindow().getPartService().addPartListener(listener);
In the listener the partOpened tells you about a new part opening:
public void partOpened(IWorkbenchPart part) {
    // Is this an editor?
    if (part instanceof IEditorPart) {
        IEditorPart editor = (IEditorPart) part;
        // Get the file being edited
        IFile file = (IFile) editor.getAdapter(IFile.class);
        // TODO file is the current file - may be null
    }
}
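Putting the registration and the callback together, a minimal sketch might look like this; the OpenFileTracker class name is made up, and only partOpened does anything.
import org.eclipse.core.resources.IFile;
import org.eclipse.ui.IEditorPart;
import org.eclipse.ui.IPartListener;
import org.eclipse.ui.IWorkbenchPart;
import org.eclipse.ui.PlatformUI;

public class OpenFileTracker {

    // Call this once the workbench window is available, e.g. from your plug-in's UI startup code.
    public void install() {
        PlatformUI.getWorkbench().getActiveWorkbenchWindow().getPartService()
                .addPartListener(new IPartListener() {
            @Override
            public void partOpened(IWorkbenchPart part) {
                if (part instanceof IEditorPart) {
                    // Resolve the workspace file behind the editor; may be null for non-file editors.
                    IFile file = (IFile) ((IEditorPart) part).getAdapter(IFile.class);
                    if (file != null) {
                        System.out.println("Opened: " + file.getFullPath());
                    }
                }
            }

            @Override public void partActivated(IWorkbenchPart part) { }
            @Override public void partBroughtToTop(IWorkbenchPart part) { }
            @Override public void partClosed(IWorkbenchPart part) { }
            @Override public void partDeactivated(IWorkbenchPart part) { }
        });
    }
}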

Multiple file upload (Stripes)

I am trying to make a file upload that will accept multiple files using the Stripes framework. I have a single file upload working. I am trying to understand the documentation for multiple file upload.
According to the docs, all I need to do is modify the single-file example to:
<stripes:form>
    <c:forEach ... varStatus="loop">
        ...
        <stripes:file name="newAttachments[${loop.index}]"/>
        ...
    </c:forEach>
</stripes:form>
ActionBean
private List<FileBean> newAttachments;

public List<FileBean> getNewAttachments() {
    return this.newAttachments;
}

public void setNewAttachments(List<FileBean> newAttachments) {
    this.newAttachments = newAttachments;
}
What do I need to replace the ... (in particular, the one in the forEach loop) with to get this example working? Thanks.
Probably this should work:
<c:forEach begin="1" end="3" varStatus="loop">
See the Tag forEach doc.
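On the ActionBean side, the bound FileBean list can then be consumed in an event handler. The following is only a rough sketch: the upload event name, the target directory, and the forward page are made up, while FileBean, Resolution, ForwardResolution, and @DefaultHandler come from net.sourceforge.stripes.action.
@DefaultHandler
public Resolution upload() throws Exception {
    if (newAttachments != null) {
        for (FileBean attachment : newAttachments) {
            if (attachment != null) {
                // Persist each uploaded file; the target directory is illustrative.
                attachment.save(new File("/tmp/uploads", attachment.getFileName()));
            }
        }
    }
    return new ForwardResolution("/upload.jsp");
}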

How to get the name of a temporary file created by File.tmpfile in D2?

I need to generate a temporary file, fill it with some data, and feed it to an external program. Based on the description of D available here, I'm using the File.tmpfile() method:
auto f = File.tmpfile();
writeln(f.name());
which doesn't provide a way to get the generated file name. It's documented that the name might be empty. In Python I would do it like this:
(o_fd, o_filename) = tempfile.mkstemp('.my.own.suffix')
Is there a simple, safe and cross-platform way to do that in D2?
Due to how tmpfile() works, you can't use it if you need the name of the file. However, I have already created a module to work with temporary files. It uses conditional compilation to decide on the method of finding the temporary directory. On Windows, it uses the %TMP% environment variable. On Posix, it uses /tmp/.
This code is licensed under the WTFPL, so you can do whatever you want with it.
module TemporaryFiles;

import std.conv,
       std.random,
       std.stdio;

version(Windows) {
    import std.process;
}

private static Random rand;

/// Returns a file with the specified filename and permissions
public File getTempFile(string filename, string permissions) {
    string path;
    version(Windows) {
        path = getenv("TMP") ~ '\\';
    } else version(Posix) {
        path = "/tmp/";
        // path = "/var/tmp/"; // Uncomment to survive reboots
    }
    return File(path ~ filename, permissions);
}

/// Returns a file opened for writing, with the specified filename
public File getTempFile(string filename) {
    return getTempFile(filename, "w");
}

/// Returns a file opened for writing, with a randomly generated filename
public File getTempFile() {
    string filename = to!string(uniform(1L, 1000000000L, rand)) ~ ".tmp";
    return getTempFile(filename, "w");
}
To use this, simply call getTempFile() with whatever arguments you want. It defaults to write permission.
As a note, the "randomly generated filenames" aren't truly random, as the seed is set at compile time.