How to delete a large file in Grails using Apache Camel

I am using Grails 2.5 and we are using Camel. I have a folder called GateIn whose poll delay is 3 minutes, so every 3 minutes Camel looks in the folder for a file. If a file exists, it starts to process it. If the file is processed within 3 minutes, it gets deleted automatically. But suppose my file takes 10 minutes: the file is not deleted, and the same file is processed again and again. How can I make the file get deleted whether it is a small or a bulk file? I have used noop=true to stop reuse of the file, but I also want the file deleted once it is processed. Please give me some suggestions.

You can check the file size using the Camel file language and decide what to do next.
Usually, when a short polling interval has to cope with large files, it is better to have a separate processing zone (a physical directory): move the file there immediately after consuming it.
You can then have separate logic, or a separate Camel route, process the file from that zone. After successful processing, you can delete the file or take whatever step your requirements dictate (see the endpoint sketch below). Hope it helps!
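As a concrete sketch of that idea, the consumer endpoint options below illustrate it (the GateIn folder and 3-minute delay come from the question above; the .processing directory name is an assumption). Note that noop=true has to be removed, since noop tells Camel to leave the file untouched and is exactly what prevents the delete:

file:GateIn?preMove=.processing&delete=true&readLock=changed&delay=180000

Here preMove parks the file in .processing before the route runs, so the next poll cannot pick it up again; delete=true removes the file only after the route has finished, however long that takes; and readLock=changed makes Camel wait until the file has stopped growing before consuming it.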

Related

How can I safely import files to SQL Server in SSIS while new files are actively being written to the source directory?

I need to import many XML files into SQL Server every day. I was thinking of running a Foreach Loop container every few minutes to import the files into the DB table and then move them to another directory, but sometimes over a dozen new files are written to the source folder every minute. Is it going to be an issue if the package tries to loop through the folder at the exact moment new files are being written to it? If so, how can I work around this?
You could loop over the files in a script task and attempt to move them to a separate "ReadyToProcess" folder in a try/catch. Catch the IOException if the file is in use by another process, and continue on to the next file. The skipped file will be picked up on the next run. Then loop over the files in "ReadyToProcess" to read them into the database.
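A minimal VB.NET sketch of that script-task loop (the directory names are assumptions; a real package would read them from package variables):

Imports System.IO

Module MoveReadyFiles
    Sub Main()
        Dim sourceDir As String = "C:\Incoming"        ' assumption
        Dim readyDir As String = "C:\ReadyToProcess"   ' assumption
        For Each filePath As String In Directory.GetFiles(sourceDir, "*.xml")
            Try
                File.Move(filePath, Path.Combine(readyDir, Path.GetFileName(filePath)))
            Catch ex As IOException
                ' File is still being written by the producer; skip it.
                ' It will be picked up on the next package run.
            End Try
        Next
    End Sub
End Module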
It seems like you know which files are finished writing and which are still being modified, which makes things a little easier. It is important to remember: if your SSIS task tries to open a file that is currently being modified or used by another process, the SSIS package will fail.
You can work around this by using a script task to generate a list of the files in your source folder at a point in time, and then using a For Loop or Foreach Loop to fetch only the files that are in the generated list. This is in contrast to fetching everything that is in your source folder, as your post implies.
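A point-in-time snapshot of that sort might look like this in a VB.NET script task (the path and the import routine are illustrative):

Imports System.IO

Module SnapshotImport
    Sub Main()
        ' Capture the folder contents once; files that arrive while the
        ' import is running are simply left for the next execution.
        Dim snapshot As String() = Directory.GetFiles("C:\Incoming", "*.xml")
        For Each filePath As String In snapshot
            ImportToSqlServer(filePath) ' hypothetical import routine
        Next
    End Sub

    Sub ImportToSqlServer(path As String)
        ' Load one XML file into the database table here.
    End Sub
End Module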
Other solutions would be to batch your incoming files and offset the package execution time so there isn't a risk of the file being exported to SQL as it's imported into your source folder.
For instance, load your source documents in batches every 30 minutes (1:00, 1:30, 2:00, ...) and execute your SSIS package every 30 minutes as well, but offset from the batch by 15 minutes (1:15, 1:45, 2:15, ...).
Lastly, if possible, run your SSIS package during a period when no new files are being written to your source folder. While that is not always possible, if you knew there wouldn't be any new documents coming in at 2 AM, that would be the best time to schedule your SSIS package.

I need to automate a process to open files created by another application

I have a Java application on a server that creates log files.
On another server I have a VB.NET application that processes those logs, searching for certain strings; if I select a file manually, everything works fine.
Now I need the second application to automatically open every new log file on the remote server, once a minute.
So I wonder if there is any way to know when a new log file is created, or which files are new since the last minute.
The file names have the format server.log.*.log, so files with a different name format must be ignored.
More info:
there is a second Java application on the first server that deletes logs older than one day.
You can use Directory.GetFiles every minute to get the list of all the files in the folder. Compare it to the previous list (which you keep in memory) and process the new files.
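A minimal sketch of that comparison, assuming the server.log.*.log pattern from the question (the share path and the processing routine are illustrative):

Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module LogPoller
    Private previousFiles As New HashSet(Of String)

    ' Call this once a minute (e.g. from a Timer).
    Sub CheckForNewLogs()
        Dim current As String() =
            Directory.GetFiles("\\logserver\logs", "server.log.*.log")
        For Each newFile As String In current.Where(Function(f) Not previousFiles.Contains(f))
            ProcessLog(newFile) ' hypothetical routine that scans for the strings
        Next
        previousFiles = New HashSet(Of String)(current)
    End Sub

    Sub ProcessLog(path As String)
        ' Search the log file for the strings of interest here.
    End Sub
End Module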
Another option is to monitor the folder for changes. This can be done using the System.IO.FileSystemWatcher class. By setting the Path and the proper NotifyFilter, you can see which files were created, via the Created event, as soon as (or almost as soon as) they appear.
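For example (the share path is an assumption; the filter matches the server.log.*.log naming from the question):

Imports System.IO

Module LogWatcher
    Sub Main()
        Dim watcher As New FileSystemWatcher("\\logserver\logs", "server.log.*.log")
        watcher.NotifyFilter = NotifyFilters.FileName
        AddHandler watcher.Created,
            Sub(sender, e) Console.WriteLine("New log file: " & e.FullPath)
        watcher.EnableRaisingEvents = True
        Console.ReadLine() ' keep the process alive while watching
    End Sub
End Module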

VB.NET - file movement lock on destination folder

Source code: VB.NET
We are using the File.Move() method to move files from a source to a destination location.
The destination location is monitored by a tool: whenever we move files there, it picks them up and processes them. The issue is that when we move a huge file, around 5 GB, the tool immediately picks the file up and tries to process it before the move operation is complete, then sends a failure notice to all the users. Once the file has finished moving, the tool picks it up again, processes it successfully this time, and sends a success notice.
We can't control the tool that monitors the destination folder because it is a third-party tool. We want to find an alternative: some way to place a lock on the destination folder (read/write access, say) until the move operation completes, so that the third party cannot pick up or access the file.
Please help us.
Not sure if it works, but you might be able to make the following work with directories as well:
FileOpen(1, "c:\file.ext", OpenMode.Binary) ' open file number 1 in binary mode
Lock(1) ' lock the whole file so other processes cannot read or write it
' Do something with the file here
Unlock(1) ' release the lock
FileClose(1) ' close file number 1
Reference and example here
I hope it helps.
First, I agree with @hometoast: sometimes tools like this just look for specific file extensions, so you can copy the file in with a different extension and then rename it.
But barring that, download the file to a temp location on the same volume, then Move the file into the dir being watched. A move does not recopy the file contents; it just updates the file's entry in the filesystem, so it should be atomic. (A cross-volume move falls back to copy-and-delete, which reintroduces the problem.)
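A minimal sketch of that write-then-rename pattern (all paths are assumptions; the staging folder sits on the same volume, D:, as the watched folder, so the final Move is a rename rather than a copy):

Imports System.IO

Module SafeDrop
    Sub Main()
        Dim stagingPath As String = "D:\staging\report.zip"  ' assumption
        Dim watchedPath As String = "D:\watched\report.zip"  ' assumption

        ' Write the file completely while the monitoring tool cannot see it...
        File.Copy("C:\source\report.zip", stagingPath, overwrite:=True)

        ' ...then publish it with a single atomic rename.
        File.Move(stagingPath, watchedPath)
    End Sub
End Module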

How do I force a file to be deleted? Windows Server 2008

On my site a user may upload a file (pic, zip, audio, video, whatever). He may then decide to replace it with a newer revision: a user may upload a file, make a post, then decide to put up a new revision replacing the old one (let's say it's a large zip or tar.gz file). There's a good chance people may be downloading it if he sent out an email, or even an IM for the home user.
The problem: I need to replace the file while people may be downloading it, and it may be some minutes before it can be deleted. I don't want my code to stall until it can delete the file, or to check every second to see whether it's unused (which is especially bad if another user can start downloading and prolong the cycle).
How do I delete the file while users are downloading it? I don't care if their downloads stop; I just care that the file can be replaced and that new downloads get the new revision.
What about referencing the files indirectly?
A mapping script maps a virtual file entry on your site to a real file. If a user wants to upload a new revision of his file, you just update the mapping, not the real file.
You can install a daily task that scans all files and deletes every file that has no mapping and no open connections.
lajuette's answer is right; the easiest solution is to work around the file locking altogether:
When a user uploads file foo.zip, internally store it as foo-v1.zip.
Create a mapping file somewhere (database, code, whatever) that maps foo.zip to foo-v1.zip.
Rather than exposing a direct link to the file, expose a link to a service that gets the file: mysite.com/Download?foo.zip or something. This service uses the mapping to determine which version of the file to send to the client.
When a new version is uploaded, create foo-v2.zip and update the mapping file.
It wouldn't be that hard to write a scheduled task that cleans up old, un-mapped files.
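A minimal sketch of the mapping idea (the in-memory dictionary and all names are illustrative; a real site would keep the mapping in a database, as suggested above):

Imports System.Collections.Generic

Module DownloadService
    ' Virtual name -> current physical file.
    Private ReadOnly mapping As New Dictionary(Of String, String) From {
        {"foo.zip", "files\foo-v1.zip"}
    }

    ' Called by the download endpoint (e.g. mysite.com/Download?foo.zip).
    Function ResolvePhysicalFile(virtualName As String) As String
        Return mapping(virtualName)
    End Function

    ' Called after a new revision is uploaded: repoint, don't overwrite.
    Sub PublishRevision(virtualName As String, physicalFile As String)
        mapping(virtualName) = physicalFile ' e.g. "files\foo-v2.zip"
    End Sub
End Module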
If you're opposed to a database, and if the filenames are in a fixed format (such as user/id.ext), you could append a revision number to the id and enumerate the folder using a pattern (user/id-*), using the latest revision found.

Start external process several times simultaneously

I need to start an external process (which is around 300 MB on its own) several times using System.Diagnostics.Process.
The only problem: once the first instance starts, it generates temporary data in its base folder (where the application is located), so I can't just start another instance; it would corrupt the data of the first one and mess everything up.
I thought about temporarily copying the whole application folder programmatically so that each instance has its own, but that doesn't feel right.
Could anybody help me out? Thanks in advance!
Try starting each copy in a different directory.
If the third-party app ignores the current directory, you could make a symlink to it in a different folder. I'm not necessarily recommending that, though.
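A minimal sketch of the different-working-directory approach (the executable and instance paths are assumptions, and as noted above it only helps if the app writes its temporary data relative to the current directory):

Imports System.Diagnostics
Imports System.IO

Module Launcher
    Sub Main()
        For i As Integer = 1 To 3
            ' Give each instance its own working directory so the
            ' temporary data of the instances does not collide.
            Dim workDir As String = Path.Combine("C:\instances", "run" & i)
            Directory.CreateDirectory(workDir)
            Dim psi As New ProcessStartInfo("C:\tools\bigapp.exe") With {
                .WorkingDirectory = workDir,
                .UseShellExecute = False
            }
            Process.Start(psi)
        Next
    End Sub
End Module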
Pass an argument to your external process that specifies the temp folder to use.