document migration name extraction - sql

I have a scenario and would like to see if anyone has any suggestions on how I should tackle it. Basically, I have a directory full of files whose names follow the pattern [Code]-[number]-[text]:
CODE - A generic 3 letter code.
NUMBER - a number generally 4 - 5 digits in size.
TEXT - original document name (Before it was dumped).
CODE, NUMBER and TEXT are separated by a hyphen (-). NUMBER always starts at the fifth character.
I would like to scan that directory and extract the number from each filename, then compare that number to a field in a database (the SQL query is fairly straightforward, and I could also export the values as raw text). If the number matches a number in the database, I would like to separate those files from the rest.
If I need to clarify anything please ask. I wasn't sure if this site is appropriate for my query.

Open the root folder, click in the file explorer path (in open space off to the side so the whole path gets highlighted), type cmd and hit enter to open a command prompt from that folder location.
Type: dir /b /s > filelist.txt to get a list of all file names. You can exclude /s if you don't need/want to dig down into subfolders.
I'd paste that into Excel. If you have Excel 2013 or later, you can just start typing the part you want to extract; after you type the full first line, when you start typing the next line Excel will recognize the pattern (Flash Fill) and you can just hit Enter to fill down.
Otherwise, use Data > Text to Columns and specify - as a delimiter.
Likewise, you could just import the file list and separate the parts in SQL using SUBSTRING() or similar. Once you have your matching filenames, you can use some concatenation to build a .bat file of COPY or MOVE commands, which is pretty easy in SQL or Excel.
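For anyone who would rather script it, here is a minimal Python sketch of the same idea. It assumes the filenames look like ABC-12345-original name.pdf, as described in the question; the function names, the sample filenames and the numbers_from_db set are hypothetical and only there for illustration. The numbers are compared as strings, so if the database column is numeric you would need to convert one side first.

```python
import re

# Assumed filename layout from the question: CODE-NUMBER-TEXT,
# e.g. "ABC-12345-report.pdf" (three letters, hyphen, digits, hyphen).
NAME_PATTERN = re.compile(r"^[A-Za-z]{3}-(\d+)-")


def extract_number(filename):
    """Return the NUMBER part as a string, or None if the name does not fit."""
    match = NAME_PATTERN.match(filename)
    return match.group(1) if match else None


def split_by_match(filenames, known_numbers):
    """Split filenames into (matched, unmatched) by whether NUMBER is known."""
    matched, unmatched = [], []
    for name in filenames:
        number = extract_number(name)
        if number is not None and number in known_numbers:
            matched.append(name)
        else:
            unmatched.append(name)
    return matched, unmatched


if __name__ == "__main__":
    # Hypothetical sample data; in practice the names could come from
    # os.listdir() and the numbers from the database query.
    names = ["ABC-1234-invoice.pdf", "XYZ-56789-notes-final.docx", "readme.txt"]
    numbers_from_db = {"1234"}
    matched, unmatched = split_by_match(names, numbers_from_db)
    print(matched)
    print(unmatched)
```

The matched list could then be passed to shutil.move() to put those files in a separate folder.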

Related

Search for Multiple Strings in PDF and Word, Return Page Number(s) Where Strings Appear - Power Automate

I have a list of strings (e.g. "A3.11.2.3", "A3.2.1" and "A12.1.3(b)") and need to find a streamlined way to extract the page number(s) on which each string appears from PDF and Word files.
The list of strings is fixed/can be hardcoded though it would be better if they were read from a particular excel file. The Flow I am trying to create is:
1. When a file is created;
2. Search the file for the list of strings and return all page numbers on which the strings appear, with each page number separated by a comma;
3. Populate a Microsoft Word template with each string's page numbers (i.e. a template table will be created with one string on each row, and the column beside it will be populated with the page numbers).
Items 1 and 3 are easy, item 2 has been destroying my brain for how to implement.
The files to be searched are most often PDFs (always created digitally, so there is no need to add OCR) but occasionally include Word documents.
All ideas welcome!
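The search step (item 2) can be sketched outside Power Automate as follows. This is only an illustration in Python, and it assumes the text of each page has already been extracted into a list with one string per page; how that extraction is done is not covered here, and the function name pages_containing is hypothetical.

```python
def pages_containing(page_texts, search_strings):
    """Map each search string to a comma-separated list of 1-based page numbers.

    page_texts is assumed to hold one already-extracted text string per page.
    Matching is a plain substring test, so "A3.2.1" would also match inside
    "A3.2.11"; a stricter match would need word-boundary handling.
    """
    result = {}
    for needle in search_strings:
        pages = [
            str(number)
            for number, text in enumerate(page_texts, start=1)
            if needle in text
        ]
        result[needle] = ", ".join(pages)
    return result


if __name__ == "__main__":
    # Hypothetical page contents for illustration only.
    pages = ["See A3.2.1 here", "nothing relevant", "A3.2.1 and A12.1.3(b)"]
    print(pages_containing(pages, ["A3.11.2.3", "A3.2.1", "A12.1.3(b)"]))
```

A string that never appears maps to an empty string, which leaves its cell in the template table blank.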

Comparing two files using BeyondCompare - check for content

I have two text files containing many lines of data (they are just Linux paths). The order of the paths is different in the two files. I need Beyond Compare to compare the files based on content. Right now, it compares line by line and reports a difference whenever corresponding lines do not have the same content. I want Beyond Compare to go through the entire file before reporting that a path is missing. How can I do this?
You can make Beyond Compare 4 sort the files before comparison. Open the files in the Text Compare, then click the dropdown on the right side of the Format toolbar button and select Sorted.
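If a scripted check is also useful, the same order-independent comparison can be sketched in Python by treating each file as a set of lines. This is an alternative to Beyond Compare's Sorted format, not a description of how it works internally, and the function name missing_lines is hypothetical. Because sets are used, duplicate lines are ignored.

```python
def missing_lines(left_lines, right_lines):
    """Return (only_in_left, only_in_right), ignoring line order and blank lines."""
    left = {line.strip() for line in left_lines if line.strip()}
    right = {line.strip() for line in right_lines if line.strip()}
    return sorted(left - right), sorted(right - left)


if __name__ == "__main__":
    # Hypothetical sample paths; real input could come from open(path).readlines().
    left_file = ["/usr/bin/a\n", "/usr/bin/b\n"]
    right_file = ["/usr/bin/b\n", "/usr/bin/c\n"]
    only_left, only_right = missing_lines(left_file, right_file)
    print("Only in left:", only_left)
    print("Only in right:", only_right)
```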

Pick a random line from a text file and store it in a variable (Python 3)

I am trying to code a program which reads a file, which will contain many words (one word per line), then selects a random line (word) from the file, so I am able to store it in a variable for me to use later on.
I don't really know where to start as I am not very experienced. Any help would be appreciated.
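Well, first you will need to open the file for reading (mode 'w' would erase its contents):
file = open('filename.txt', 'r')
Then you need to read the file. You can read each line into a list by doing words = file.readlines() (this can also be done with a loop or in a number of other ways). Each line keeps its trailing newline, so call .strip() on the word you pick.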
Then you can use the random module to generate a random number and get the word from that index in the words list. Then just store that word to a variable.
There are other ways of doing this but this is one of the easiest.
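Put together, the steps above could look like the following minimal sketch. The function name pick_random_word and the file name filename.txt are placeholders, and the sketch assumes the file contains at least one non-blank line (random.choice raises IndexError on an empty list).

```python
import random


def pick_random_word(path):
    """Read one word per line from path and return a randomly chosen one."""
    with open(path, "r", encoding="utf-8") as file:
        # strip() removes the trailing newline; blank lines are skipped.
        words = [line.strip() for line in file if line.strip()]
    return random.choice(words)


if __name__ == "__main__":
    # "filename.txt" is a placeholder; replace it with your own word list.
    word = pick_random_word("filename.txt")
    print(word)
```

Using random.choice avoids generating an index by hand, and the with statement closes the file automatically.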

beyond compare (4) how do I ignore case in file content when comparing folders

In Beyond Compare, how do I ignore differences in some words contained in files inside the folders being compared?
Example:
On the left side, I have a file in a folder that contains the word 'Hello'.
On the right side, in the same place, the file contains 'Tello'.
I would like these to be treated as equal files.
Also, how do I ignore case in file content when performing a folder comparison (not a file comparison), i.e. in all files contained in the folder?
To hide text differences in Beyond Compare 4's Text Compare, see the article Define Unimportant Text in Beyond Compare on Scooter Software's website.
To ignore differences in text file content in the Folder Compare, double click a pair of files to open them in the Text Compare. Follow the instructions at the above link, but change the dropdown of the Session Settings dialog from Use for this view only to Use for all files within parent session before you click OK. Then close the Text Compare.
In the Folder Compare, click the Rules toolbar button (referee icon). In the Comparison tab, check Compare Contents and select Rules-based comparison. This will compare the text contents of files in the Folder Compare using the settings you defined in the child Text Compare.
I got case insensitivity when comparing *.txt files by using View > Ignore Unimportant Differences.

Append multiple PDF files

I have about 600 PDF files, and I want to add a single disclaimer page to the beginning of each of them. So I need to find a way to merge two PDF documents, where one file is always the same and comes first and the second file changes.
Please let me know how I can do this.
Thanks!
I found this:
http://gotofritz.net/blog/howto/joining-pdf-files-in-os-x-from-the-command-line/
So you could do something like this in a shell script:
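#!/bin/bash
page1="disclaimer.pdf"
# List the PDFs to process, separated by spaces (braces here would not expand).
for f in a.pdf b.pdf c.pdf; do
    # Write to a new name so the original is not overwritten while it is being read.
    "/System/Library/Automator/Combine PDF Pages.action/Contents/Resources/join.py" -o "combined-$f" "$page1" "$f"
done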
You can wrap this in an AppleScript or Automator workflow if you like.
Combine an unknown PDF with a known PDF
Automator can do this. You would save it as an application, so that you can drop the second file onto it. Your steps would be something like:
Get Specified Finder Items: Add the known PDF document here, the one that is “always the same and comes first”.
Combine PDF Pages
Move Finder Items: To: Combined Files. This will create a randomly-named file in the folder “Combined Files” which you will need to create.
If this Automator workflow is saved as an application, you can drop your second, "changing" file onto the application icon. The workflow will combine the two, putting the named file first.
Loop this workflow for each dropped file
In order to do this on multiple files at once, you will need to create a second workflow that loops through each dropped file and calls the above workflow. This second workflow is also created as an application. It has only one step:
Run Workflow: choose the workflow created in the previous step, and process items in batches of 1 at a time using 1 workflow.
Give the files a more usable name
That’s the basics. The obvious drawback is that all of the files have random names. That can be fixed by grabbing the original filename into a variable, and saving the new, combined document using that name in your new folder.
First, add five new steps to the top of the first workflow, in front of “Get Specified Finder Items”:
Run AppleScript
Set Value of Variable: name the variable filepath.
Run AppleScript
Set Value of Variable: name the variable filename.
Get Value of Variable: filepath
The first two steps save the full path to the dropped PDF. The third and fourth steps get just the filename portion. The fifth new step puts the full path back into the workflow so that it can be combined with the known disclaimer.
Set the AppleScript in step 1 to:
on run {input, parameters}
return input
end run
Set the AppleScript in step 3 to:
on run {input, parameters}
    tell application "Finder"
        set filename to name of file input
    end tell
end run
Then, add a step at the end to rename the file. After Move Finder Items, add Rename Finder Items. Choose “Name Single Item”, “Full name to” and then drag the variable filename up to the text box.