Looking for fast "Find in Files" program - vb.net

I currently have a directory with 98,000 individual archive transaction files. I need to search those files for user input strings and have the option to open the files as it finds them or at the end of the search. I'm using Notepad++ currently and, while functional, it's quite slow. I thought about writing my own, but I am only familiar with .NET and I'm a beginner. Also, I'm not sure how efficient that would be compared to NP++.
This tool would be used again and again so the dev time would definitely be worth it if it came to that. Is there some other tool out there that's already developed that would accomplish this?

Agent Ransack
I've been using it for years.

I recommend you using Astrogrep, a grep utility for Windows. You can open files as it finds them, and it shows you the line where the match was found, without having to open the file.

Assuming the archive transaction files are plain text, you can download CYGWIN which is an environment providing UNIX tools for Windows.
Once that's done, you can open a new Cygwin Bash Shell, then do cd 'c:\\foo' to get into the directory with your files, then do grep -F -r "my string" * to find your text. (The -F means it searches for that literal string as opposed to a regular expression and -r means recursive.)

Possibly overkill, but you could index the folder using Lucene, keep the index uptodate (as transaction files are added) and then searches will take trivial amounts of time, you can target the file, line and word number of each match for a given search string

Related

How to read a Executable(.EXE) file in OpenVMS

When am trying to open any .EXE file am getting information in encoded form. Any idea how to see the content of an .EXE file ????
I need to know what Database tables are used in the particular .EXE.
Ah, now we are getting closer to the real question.
It is probably much more productive to ask the targeted databases about the SQL queries being execute during the run, or a top-ten shortly afterwards.
The table-names might not be hard-coded recognizably as such in the executable.
They might be obtained by a lookup, and some fun pre-fixing or other transformation might be in place.
Admittedly they like are clear text.
Easiest is probably to just transfer to a Unix server and use STRINGS on the image.
I want to include the source here with but that failed, and I cannot find how to attach a file. Below you'll find a link OpenVMS macro program source for a STRINGS like tool. Not sure how long the link will survive.
Just read for instructions, save (strings.mar), compile ($ MACRO strings), link ($link strings), and activate ($ mcr sys$login:strings image_to_test.exe)
OpenVMS Macro String program text
Good luck!
Hein
Use analyze/image to view the contents of an executable image file.
I'm guessing you are trying to look in the EXE because you do not have access to the source. I do something like this:
$ dump/record/byte/hex/out=a.a myexe.exe
Then look at a.a with any text editor (132 columns). The linker groups string literals together, and they are mostly near the beginning of the EXE, so you don't have to look to far into the file. Of course this only helps if the database references are string literals.
The string literal might be broken across a block (512 byte) boundary, so if you use search in your editor, try looking for substrings.
Aksh - you are chasing your tail on this one. Its a false dawn. Even if you could (and you can't) find the database tables, you will need the source of the .exe to do anything sensible with it, or the problem you are trying to solve. Its possible to write a program which just lists all the tables in a database without reading any of 'em. So you could spend and awful lot of effort and get nowhere. Hope this helps

How to remove .efs file extension from 1000's of recovered files in one folder

I recently recovered a 1.5TB external HDD that crashed. The program I used to recover the files was Active Undelete Enterprise, it's excellent. When the files were successfully recovered they were all saved with a .efs extension so files looked like mydocument.docx.efs. At first I thought they were encrypted and needed to be decrypted, I spent 10 mins on it and realized I just need to remove the .efs from the entire filename and the mydocument.docx works perfectly. Problem is now I have over 55,000 files within hundreds of folders where I need to simply remove the .efs after each file. Does anyone know how to do this?
From a command prompt window, navigate to the top level directory where these files reside.
Type the command
DIR /S/B >>filelist.txt
This command will give you a bare format file listing of the current directory plus all nested subdirectories without any extraneous information. The list will be contained in the text file named "filelist.txt" or whatever else you choose to call it. I would then use this text file in a text editor to convert every line of text from, for example,
C:\Users\dlucas\.gimp-2.8\mathmap\file1.png.efs
to
rename c:\Users\dlucas\.gimp-2.8\mathmap\file1.png.efs file1.png
to give a simple example of a file that I just found on my system using this method.
You will need to use a text editor with a columnar editing capability since you have to modify som many files. Old programmer's editors such as CodeWright made this really simple while modern editors such as Eclipse or Notepad++ make this a little more difficult and may require a columnar editing plugin, depending on version. You basically have to make a columnar copy of all of the text in the file, and then paste the copy off to the far right - far enough that a second column of filenames and paths won't overwrite any of the existing file names and paths. You can then use columnar editing features to select and delete the path names of the text in the 2nd column since the rename command requires that the 2nd argument be simply the base filename and extension without the path information. You can use the columnar editing features to prepend every line with "RENAME ". If you attempt to do this without columnar editing features, you will find it slow going!
An alternate way to do this is to use a command formed from a "regular expression" to create the rename command. If you are not familiar with "regular expressions", ask a programmer friend as this is not an easy topic to learn from scratch. If you are familiar with regular expressions, this is probably the simplest way to perform this task. I haven't used them in many years and no longer recall the exact syntax to use or I would tell you myself.
Regardless of what kind of editor you use, the goal is to turn this ASCII file list of paths and filenames into a batch file (simply rename file1.txt to file1.bat when you are finished editing). You can then run the batch file by typing file1.bat at a command prompt.
I have just run into this same problem myself using the same really wonderful tool that you used. I am writing this while waiting for the undelete program to finish. That it restores files with this extra extension seems very anti-intuitive so I will look for an option to make it not do this when it finishes. If I find one, I will post a new answer here that is more specific to this tool. Otherwise, I am going to have rename all kazillion files just as you had to.
You experienced this problem because the disk that you recovered your files to "does not support encryption", according to the Active# UNDELETE documentation. The documentation offers no further explanation of what kind of disks support encryption, etc.
They offer a Decrypt command that restores the file's proper names as a post processing step. Unfortunately, this requires that you "include" each and every file to be decrypted, with no support for wildcards and parsing subdirectories so that is a non-starter, in my opinion given that both of us have hundreds of thousands of files to be renamed.
I did find that by selecting a normal fixed (non-removable) hard drive as the destination of the recovery effort, that the resulting files do not end up encrypted (i.e., they are recovered with the proper file name and extension). I originally chose a large USB based flash drive and the files were stored in their "encrypted" state (not really encrypted, but possibly potentially so and thus they give the .efs extension). Of course, this meant that I had to run the command all over again after switching to a regular hard drive (takes about 16 hours to recover 80GB worth of files due to presence of many sector CRC errors).

What are some problems with having $ in file names?

I'm trying to convince fellow developers to not include $ in file names but couldn't come up any argument other than some Perl scripts may need to be updated. Any suggestions?
A quick glance at Wikipedia's Filename page reveals that only zOS and Commodore DOS disallow the user of $ appearing in certain file names. I can see why you would have trouble coming up with a reason not to use them.

mercurial version control with word

This is a followup to svn or mercurial version control of word documents
I potentially want multiple non-programmers to be able to use version control on word documents. I can configure mercurial to look at the unzipped docx files. What I want is as follows:
Read from Docx files (answered in that question, using a feature of mercurial to unzip before comparing, awesome!
automatically merge documents whenever there are non-colliding changes. It appears from the previous answer that this is done using comparison tools.
programmatically run word on the two documents if there are collisions, comparing the two.
I have manually opened one file, then another in Word to see what it was like. On my word 2004, it seems a bit buggy, but I see from reviews that the feature is much improved in 2010.
I found this link:
http://office.microsoft.com/en-us/word-help/command-line-switches-for-microsoft-office-word-2007-HP010164010.aspx#BM1
for command lines, and now see that I can execute the command:
winword /q /f file1.docx /f file2.docx
The q is for quiet, /f specifies a file. The docs don't say if I can specify two files but I tried and it loads two in separate windows.
So the only thing I don't know is how to trigger word to compare the two.
Is the word interaction a fairly easy scripting job, or does it involve binary APIs that I don't want to know about, like DCOM, ActiveX, etc.
Digging around in the TortoiseHg directory, I found some examples of scripts implementing diff/merge of doc files in the diff-scripts directory. There is an [extdiff] section in Mercurial.ini that can be configured to use this scripts. This may get you started.

files opened by a process on VMS

I have a DCL script on VMS which calls a perl script. Is there a VMS/DCL command I can use that will tell me every file handle opened by the perl script?
Set default to the disk the app runs from (or you might have to try each disk in succession if it's a really large or distributed app). Then the command is
show device/files/nosystem
If you're on a more recent version of VMS and the lists are too long, you can pipe it with a search by doing this:
pipe show device/files/nosystem | search sys$input (name of perl script)
You need to find the documentation for undocumented VMS features :-)
Seriously I think that set watch might do what you want. If you issue
$ set watch file/class=(all,nodump)
$ perl yourperlscript.pl
You will get loads of output that will hopefully include what you want. I havent done it for years, you probably tune the options to fine tune it. See
http://www.parsec.com/openvms/undocumented.php?page=13
Jason, I need more clarification for a). Are you saying that you want to run your perl script in a batch file and have the batch file monitor the files being accessed by the perl script? Or something else?
Hmm, not sure about that. Maybe add a linux tag to your post so that some linux people can see this and chime in. I'm not sure why your perl program wouldn't know what files it opened. It's your program, wouldn't it access the files you told it to access? Or if you're computing the filenames somehow (which I've done in cobol, but still know at least which directory to find them in, and what naming scheme they use), you'd still have clues like what I mention. Also, since it's your program, and if you're computing the filenames, coudn't you also make your Perl program output it's own little report of what the files were? Like, just after it computes the filename, have it copy the name string to a separate report file.