Multi Threading with VB.NET - vb.net

I've written a hot folder utility that works great when one file at a time is dropped. When a file is dropped into the hot folder, I add the filename to an array. A timer event then checks whether the array is empty and, if it isn't, takes the first value and starts processing it.
If I drop two files, the first file is processed to completion before the second one starts. I've been researching BackgroundWorker and struggling to implement it. Any insight on how to easily get each timer event to start its processing in a new thread?
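The pattern described above (a list of pending files, drained periodically, with each file handled on its own thread) can be sketched as follows. This is a language-neutral illustration written in Python rather than VB.NET; `process_file`, `run_hot_folder` and `worker_count` are hypothetical names, and the thread-safe queue stands in for the array from the question.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def process_file(path):
    """Placeholder for the real per-file work (hypothetical)."""
    return f"processed {path}"


def run_hot_folder(pending, worker_count=4, handler=process_file):
    """Drain `pending` and hand every queued file to a pool thread.

    Files are submitted without waiting for earlier ones to finish, so
    up to `worker_count` files are processed at the same time. Results
    are returned in the order the files were queued.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        while True:
            try:
                path = pending.get_nowait()
            except queue.Empty:
                break  # nothing left to pick up in this cycle
            futures.append(pool.submit(handler, path))
    # Leaving the `with` block waits for all submitted work to finish.
    return [f.result() for f in futures]
```

The point of the design is that the code that notices a file only enqueues it, and the code that drains the queue never blocks on a single file.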

Related

SSIS - Why won't my Data Flow Task fail?

I've got a simple SSIS package that runs a 'foreach' loop, checking a folder for .csv files. It imports the contents of each CSV into a staging table where the columns map. If this succeeds, it moves the file to an archive folder, appending the date. If it fails, it is supposed to put the file into a failure folder.
However, I've tested with a random CSV whose column headings don't match the mappings, and the data flow task DOESN'T fail and the file goes to the archive folder (of course the table isn't updated either). Any ideas as to why this is happening?
Here is the package:
Here is the data flow:
OK, I can do this.
Start with seven text files of input data, one of which contains error data.
The control flow executes like this.
The good files get moved to the ProcessedData folder.
The bad file gets moved to the ToReviewData folder.
The only setting you need to make is MaximumErrorCount on the Foreach Loop Container. Set this to a suitably high value.
I haven't changed any of the properties on the Load Cats task. In particular, you can see that FailPackageOnFailure is False; this is only required for checkpoints.
The precedence constraints are as you'd expect. Nothing clever here.
See training kit 70-463 > Chapter 4: Designing and Implementing Control Flow.

How to delete large file in Grails using Apache camel

I am using Grails 2.5 with Camel. I have a folder called GateIn that is polled with a delay of 3 minutes, so every 3 minutes Camel looks in the folder for a file. If a file exists, it starts processing it. If the file is processed within 3 minutes, it is deleted automatically. If the file takes 10 minutes, it is not deleted, and the same file is processed again and again. How can I make sure the file is deleted whether it is small or large? I have used noop=true to stop the file being reused, but I also want the file deleted once it is processed. Please give me some suggestions.
You can check the file size using the Camel file language and decide what to do next.
When large files have to be processed at such a short polling interval, it is usually better to have a separate processing zone (a physical directory) and to move the file into that zone immediately after consuming it.
You can then have separate logic or a separate Camel route to process the file. After successful processing, you can delete the file or take whatever step your requirements call for. Hope it helps!
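The "move it to a processing zone first" idea can be sketched outside Camel as follows. This is plain Python for illustration only, not Camel configuration; `claim_file`, `poll_once` and the directory arguments are hypothetical names. (In Camel itself, the file component's `preMove` and `delete` options are the usual place to look, though that is an assumption and not something the answer above states.)

```python
import os


def claim_file(inbox_dir, work_dir, name):
    """Move a file out of the polled folder so the next poll cannot see it.

    Returns the new path, or None if the file has already been taken.
    """
    src = os.path.join(inbox_dir, name)
    dst = os.path.join(work_dir, name)
    try:
        # A rename is atomic when both folders are on the same filesystem.
        os.replace(src, dst)
    except FileNotFoundError:
        return None
    return dst


def poll_once(inbox_dir, work_dir, handler):
    """One polling cycle: claim each file, process it, then delete it."""
    done = []
    for name in sorted(os.listdir(inbox_dir)):
        claimed = claim_file(inbox_dir, work_dir, name)
        if claimed is None:
            continue
        handler(claimed)    # may take longer than the polling interval
        os.remove(claimed)  # only reached when the handler succeeded
        done.append(name)
    return done
```

Because the file leaves the polled folder before the slow work starts, a later poll cannot pick it up again, however long processing takes. If the handler raises, the file stays in the work folder for inspection.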

Check for multiple files

Okay, I'll try to explain as well as I can... Quite a particular case.
Tools: SSIS 2008
We have a control flow that now needs to be triggered by an event: the presence of one or multiple files. (1,2 or 3)
The variables used:
BO_FileLocation_1
BO_FileLocation_2
BO_FileLocation_3
BO_FileName_1
BO_FileName_2
BO_FileName_3
There can be one, two or three files, defined in the variables above. When the variables are filled in, the files should be processed. When they are empty, this means there's just one file; the process should ignore them and jump to the next (file watcher?) task.
For example:
BO_FileLocation_1= "C:\"
BO_FileLocation_2 NULL
BO_FileLocation_3 NULL
BO_FileName_1= "test.csv"
BO_FileName_2 NULL
BO_FileName_3 NULL
The report only needs one file.
I need a generic concept that checks for the presence of these files; it could be more generic than my SSIS knowledge can handle right now. That would be handy, for example, when there's a 4th file in the future. I was also thinking of working with a single script to handle all the logic.
Thanks in advance
A possibly irrelevant image:
If all you want is to trigger the Copy Source File task when one or more of the files is present, just use the OR constraint in your flow. The following image shows you how:
First connect all to the destination:
Then click one of the green arrows. This will make its properties window pop up. Select the Logical OR instead of the Logical AND:
If everything went well, you should now see the connections as dashed lines:
There are several possible solutions:
1. Create a sequence container and include all the file imports in the sequence container. Add int variables for RowCountFile1, RowCountFile2, and RowCountFile3 and set the value to 0 (this is the default value when you create an int variable). Add a RowCount transformation to each of the data flows. Create a precedence constraint from the sequence container to the "Do something" task. Set the precedence constraint to success and expression. Set the expression value to @RowCountFile1 > 0 || @RowCountFile2 > 0 || @RowCountFile3 > 0. The advantage of this approach is that you can take an action as soon as the files are detected, you import all available files, and you only take an action after all the files have been imported. You could then schedule running this SSIS package as a SQL Server Agent job step and run it as frequently as you want.
2. A variant on solution 1 is to use Foreach File enumerator containers inside the sequence container. This would be useful if you don't know the exact name of the file and you expect to import more than one under some circumstances. For instance, if you get a file every few minutes with a timestamp in its file name and your process doesn't run for some reason, then you may have to process multiple files to get caught up and then take an action once that has been done.
3. You could use the file watcher task as you outlined in your question. The only problem I have with the file watcher task is that the package has to be in a constantly running state. This makes it hard to troubleshoot problems and performance. It can also introduce other problems; I remember having some problems with the file watcher task years ago when it first came out. It may well be a totally stable task now, but I prefer other methods over the task after having been burned previously. If you really want the package to run continuously instead of having it be called by a job, then you could always use a script task to check for the file, sleep the thread if it is not found, check again, etc. I'm sure that's what the file watcher task does, but I would trust my own C# over the task. Power to anyone who has had better experiences than me with File Watcher...
4. Use PowerShell. If you just want to take an action if a file appears and you aren't importing the data, then a PowerShell script could do this just as well as an SSIS package. The drawbacks are that you have to learn some basic PowerShell, it may be hard to maintain in the future since PowerShell is probably not your bread-and-butter core language, and you may have to rewrite the code as an SSIS package if you later want to import the data. You would probably call the PowerShell script from a SQL Server Agent job step, so scheduling can be handled pretty easily.
There are more options than what I listed, so let me know if you still want more suggestions.
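The generic presence check the question asks for, whether it ends up in a script task or an external script, comes down to looping over a list of (location, name) pairs instead of hard-coding three variables. A minimal sketch of that logic in Python, where `present_files` and `all_configured_present` are hypothetical names and empty or NULL entries are assumed to mean "not configured":

```python
import os


def present_files(candidates):
    """Return the full paths of configured files that exist on disk.

    `candidates` is a list of (location, name) pairs. A pair with an
    empty or None part is treated as "not configured" and skipped.
    """
    found = []
    for location, name in candidates:
        if not location or not name:
            continue
        path = os.path.join(location, name)
        if os.path.isfile(path):
            found.append(path)
    return found


def all_configured_present(candidates):
    """True when at least one file is configured and none is missing."""
    configured = [(loc, name) for loc, name in candidates if loc and name]
    return bool(configured) and len(present_files(candidates)) == len(configured)
```

Adding a 4th file then means adding one more pair to the list; the check itself does not change.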

tracking file renaming/deleting with FSEvents on Lion

I'm trying to use FSEvents to detect when files were added/removed from a specific folder. For the moment, I implemented a simple wrapper around FSEvents, and it works fine : I get all the events.
BUT the problem I have now is that when I rename a file in the Finder, I catch 2 distinct events: the first one of type "renamed" with the old file name, and another one of type "renamed" with the new file name. The event ids are different between both calls.
So, how am I supposed to know which "renamed" event contains the old name, and which event contains the new one? I tried looking in the documentation, but unfortunately, kFSEventStreamEventFlagItemRenamed is not documented... it seems to be new in Lion.
PS: the only way I could think of was : on a renamed event, I check my UI to see if I have an item corresponding to the event path. If so, I flag it for renaming. If not, I check if an item was flagged for renaming, and if so, then I rename it to the new event path. But I really don't like this idea ...
Edit: OK, I implemented something along the lines of my "PS": I noticed that when renaming something, the ids of the 2 events are consecutive, so with the id of the event containing the new name, I can get the event containing the old name. I simply use a little dictionary in my interface to store ids and associated paths in the case of a "renamed" event.
Anyway, I can now catch rename events, and even move events : when you move a file, it's a "renamed" event which is caught by the FSEventStream ...
But, I still have one last problem : deleting. When I delete something, it's moved to the recycle bin : I receive a "renamed" event. But the problem is that I don't receive the second rename event. Only a "modified" event on the .DS_Store file. I think this file is used by the Finder to know which files are in the bin, etc. So I could check modification to this file, and get the last "renamed" event to detect that a file was sent to the bin. But I'm using TotalFinder which uses Asepsis, which modifies the way the Finder stores .DS_Store files : I no longer receive "modified" on this.
To summarize: I can't detect when a file is sent to the bin...
Any idea how I can do that? Maybe use something other than FSEvents to catch only this event?
Well, I didn't find the perfect answer to my question, but I found a solution which I eventually was really satisfied with, so I thought I might share ^^
As I said, when moving stuff to the trash, if you're only watching 1 folder, you won't catch the event generated when the image is put in the trash. So, I decided to do the following :
I have a class which creates a stream on the root folder ("/") so that it catches all the events -> this solves the problem of files being sent to the trash, and all such stuff. This class then allows delegates to be registered on certain paths. So, instead of creating many streams, I create one big stream, filter events as needed, and create many delegates.
So all I have to do now when I want to watch events on a special folder is the following :
[[FSEventsListener instance] addListener:self forPath:somePath];
I just have to create an instance of FSEventsListener at application start, and release it when the app stops.
And I just need to implement the following 3 methods which will be automatically called :
-(void)fileWasAdded:(NSString *)file;
-(void)fileWasRemoved:(NSString *)file;
-(void)fileWasRenamed:(NSString *)oldFile to:(NSString *)newFile;
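The "one big stream, many delegates" design can be illustrated independently of FSEvents. The sketch below is Python rather than Objective-C and is not the author's FSEventsListener; `EventDispatcher`, `add_listener` and `dispatch` are hypothetical names. It only shows the filtering step, where each incoming event path is matched against the registered folders.

```python
import os


class EventDispatcher:
    """Fan events from one root-level stream out to per-folder listeners."""

    def __init__(self):
        self._listeners = []  # list of (watched_path, callback)

    def add_listener(self, callback, path):
        self._listeners.append((os.path.normpath(path), callback))

    def dispatch(self, event_path, kind):
        """Deliver the event to every listener whose folder contains it.

        Returns the number of listeners that were notified.
        """
        event_path = os.path.normpath(event_path)
        delivered = 0
        for watched, callback in self._listeners:
            # commonpath compares whole path components, so
            # "/a/Pictures" does not match "/a/PicturesOld/x.png".
            if os.path.commonpath([watched, event_path]) == watched:
                callback(kind, event_path)
                delivered += 1
        return delivered
```

Comparing whole path components rather than string prefixes avoids false matches between sibling folders with similar names.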
If you're interested in the source code of this little utility, you can check here : http://blog.pcitron.fr/tools/macosx-imageviewer/ (the utility was added at the version 0.8)
I developed it as part of a little image viewer to keep the UI synchronized with the disk content (it displays the number of images contained in each directory, etc.). The source code is available, and the utility is in Utils/FSEventsListener.h/.m.
And if by any chance someone actually downloads the application and takes a look at the sources, if you find anything useful (performance / feature improvement, whatever), feel free to drop a comment / mail ^^
You are actually raising two issues related to FSEvents and renames.
1. A file is renamed and both the old and new file names are within the directory trees being monitored.
2. A file is renamed and one of the names is not in the directory trees being monitored.
You have (almost) solved the first issue. Your application also needs a way of knowing which events are reported in the same FSEvents group of events. Your method of assuming that two renames reported consecutively belong together only works if they are in the same group of events, reported within the same latency period. If two rename events of type 2 occur one after another but are not within the same group of events, they actually have nothing to do with each other - and you will mistakenly think one file has been renamed to another.
It is possible to handle the second type of rename by simply monitoring every directory in the system using the root, but this will flood you with many unnecessary events. Instead, you can determine whether a "partial" rename is the result of a file being moved out of the monitored directory tree or into it by doing a stat() on the file. If stat() fails with an errno of 2 (ENOENT), then the file has been moved outside the directory being monitored, and it can be treated as if it has been deleted. If stat() succeeds, the event can be treated as if the file has been created.
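The two rules above can be combined in a small sketch: pair rename events only within one delivered batch, and fall back to stat() for anything left unpaired. This is Python for illustration, not FSEvents API code; `classify_renames` and `classify_single` are hypothetical names, and the assumption that the two halves of a rename carry consecutive event ids comes from the question's observation rather than from any documented guarantee.

```python
import errno
import os


def classify_single(path):
    """Resolve an unpaired rename event with stat()."""
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOENT:  # errno 2: the path no longer exists
            return "removed"
        raise
    return "created"


def classify_renames(batch):
    """Turn ONE batch of rename events into higher-level actions.

    `batch` is a list of (event_id, path) pairs that were delivered
    together in a single callback. Two events with consecutive ids are
    treated as the old and new name of one rename (an assumption);
    every other event is resolved with stat().
    """
    actions = []
    events = sorted(batch)
    i = 0
    while i < len(events):
        event_id, path = events[i]
        if i + 1 < len(events) and events[i + 1][0] == event_id + 1:
            actions.append(("renamed", path, events[i + 1][1]))
            i += 2
            continue
        actions.append((classify_single(path), path, None))
        i += 1
    return actions
```

Events from different batches are never passed to the same call, which is what keeps two unrelated partial renames from being mistaken for a single rename.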

Why File System Watcher is almost blind?

I am using FileSystemWatcher to rename files within a watched directory.
The problem occurs when more than 50 files are copied simultaneously to the watched directory.
The rename event is fired successfully for the first 50 files, but after that nothing happens.
Any suggestions please?
You'll need to give it a bigger InternalBufferSize, and respond quickly to change events. Queuing them and then processing the notifications in another thread is best. That also helps you deal with the inevitable locked file problems.
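The "queue the events, process them on another thread" advice can be sketched as below. This is Python for illustration and not FileSystemWatcher code; `on_renamed`, `start_worker`, `retries` and `delay` are hypothetical names. The event callback does nothing but enqueue, and the worker retries a file that is still locked by whoever is copying it.

```python
import queue
import threading
import time


def on_renamed(events, path):
    """Event callback: only enqueue, so it returns immediately."""
    events.put(path)


def start_worker(events, handler, retries=5, delay=0.1):
    """Start a background thread that drains `events`.

    `handler(path)` may raise OSError (for example PermissionError)
    while the file is still locked; such paths are retried up to
    `retries` times. Put None on the queue to stop the worker.
    Returns the thread and a list that collects paths that never
    succeeded.
    """
    failed = []

    def run():
        while True:
            path = events.get()
            if path is None:
                break
            for _ in range(retries):
                try:
                    handler(path)
                    break
                except OSError:
                    time.sleep(delay)  # file probably still being written
            else:
                failed.append(path)  # every attempt raised

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread, failed
```

Keeping the callback this small matters because notifications that arrive while a slow handler is running are what fill the watcher's buffer in the first place.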