Append multiple PDF files - follow-up - pdf

New user here and have a follow-up query from the original post at:
Append multiple PDF files
Firstly the solution given by Jerry Stratton on this thread is amazing and just the sort of thing I have been looking for, so thanks for this info.
I have tried to make an edit to suit my own project but am running into some issues.
I'd like to be able to set the Application to overwrite the 'changing' PDF once the cover page is prepended.
The Move Finder Items gives the option to Replace Existing File (screenshot attached) but rather than overwrite the file, it just generates a new file with a random name.
Any thoughts?
I am a complete novice scripting-wise but I am buzzing at the potential here to improve some of my workflows, any help is massively appreciated.

Related

How to Diff PDF/XFA forms in GitLab

To all whom are concerned:
My boss and I are on GitLab and we have problems trying to differentiate between dynamic PDF files.
Normally, for code files like C# class .cs files, it's easy to double-click and have GitLab highlight the changes made between two different versions.
However, we also create dynamic XFA/PDF files in Adobe LiveCycle and it's difficult to tell what has been changed at times, especially if the commit messages are not too specific or too vague. We know people suggested taking screenshots of the PDF between each version, but you can't diff text changes or format changes on image files.
We tried the program DiffPDF found here:
http://www.qtrac.eu/diffpdf.html
But we found out that it does not work with XFA/dynamic forms.
Does anyone have any suggestions on any possible programs that can diff the actual content on PDFs in GitLab?
Thank you for your time and future advice.
I will agree that it is hard to see what have been changed on a PDF, but if you instead look at the XDP-file you will be able to see what code have been changed.
If that is possible for you.

Word - Is it possible to change easily all links' source in a document?

I have 5 Word template linked to an Excel file to automatically generate new pre-formated reports.
As I am working on improving it, I have created copies that will replace the old version once I'm done. And in each template, I have approximately 70 links, so almost 350 links in total.
My question is :
Is there a way to change easily the source file for multiple links?
Solutions with or without code are welcome!
Is it possible? Yes
How? A Google search reveals multiple options with lots of code to copy/paste.

writing text to a pdf file

I have several pdf files (about 20) and very month or so I need to change spme fields with new data. This is a very time consuming task and would like to know if there is an easy way via some sort of application where users can change the name of the variables that have to be stored into the different pdf files. This would be an enormous time saver. thanks for any help.
there are lots of solutions for this.. if you are willing to write some code things can get really interesting.
a simple solution would be to create a template pdf file with placeholder fields (like #{name}, #{age} etc.,), when a new pdf needs to be created using new values you can simple use itest to edit the pdf & replace the placeholders with actual values.
you can also use jasperreports for this but it would be an overkill for just 20 odd documents.
if you are interested in a sample program i'd be happy to provide you one.
If you have form fields in the PDF file then you may use Aspose.Pdf (.NET or Java version) to fill data into those fields programmatically. You can either fill the fields using individual values or import the data from the XML/FDF/XFDF files etc. You can take a template PDF and save the output PDF files with different values. Please see if this might help in your scenario.
Disclosure: I work as developer evangelist at Aspose.

Find duplicate PDFs

I'm looking for a utility that will help me find duplicate PDFs. The problem: I have a 1000s of PDF files. Some are duplicates. They are not easy to detect due differing files names and small differences in file size. Is there a utility/algorithm/library that can help me find the duplicates or show me files that are very similar (or degree of difference)?
Create an MD5 hash for each file and store it in a database. Identical files will then sort next to each other, or you can quickly search for a pre-existing key.
The problem is not yet solved in any way. What I do, is I use fdupes http://premium.caribe.net/~adrian2/fdupes.html to find exact duplicates.
But most of all, I use a workflow which minimizes duplicates. Every document that enters my system gets indexed with this perl-script I wrote: http://seegras.discordia.ch/Programs/fileindex which puts some name and an md5-sum of it into ~/.fileindex.md5 Now I can change metadata of the local PDF-files or whatever (and run fileindex again), and whenever I accidently download the same file again, I will stil lhave the md5-sum of the original file, and thus can detect whether it's a duplicate.
There's also exif-meta and exif-rename on http://seegras.discordia.ch/Programs/ which help with setting PDF metadata and with renaming PDF-files according to metadata; and if you're tagging all the files correctly, you will end up with duplicate filenames, indicating that they might be the same document within a different file.
If the files were created by the different tools, they could look the same but generate very different results because they are structured totally differently. I made some suggestions in a blog article at https://blog.idrsolutions.com/2010/09/comparing-2-pdf-files/
DiffPDF looks like something that might help you.
I remember that there is a UNIX utility called pdf2txt (see the package poppler-utils). You can try to extract the text from the files and make a textual diff.

Is there a way to access VBA help files from the command line

I'm going to have to write a number of vba modules for a project I'm working on, and would prefer to use SciTe to the built in editer in Office.
SciTe allows you to redirect the effect of hitting F1 to a arbitary command with the selected text as an argument. Is there anyway of using this functionality to search the relevant .chm files?
I'm guessing not, given that the help for vba is spread across multiple files, but I'm hoping someone can prove me wrong...
I'm especially interested if anyone can suggest a way to find out which chm file a particular libraries help resides, just from the fully delimitered name of the function.
Another approach is to use the HTML Help command line program HH.EXE to either show specific pages, or to decompile a particular CHM into HTML files.
Go to the folder mentioned by Lunatik in a command window and enter this command:
hh -decompile html vbaac10.chm
^^
# ac is for Access; use xl for Excel, wd for Word, etc
This will create an "html" folder below it and fill it with most of the files that went into creating the CHM file. The resulting HTML files can be opened directly in your browser, although they won't find their related style sheets or scripts which are addressed by their locations in CHM files. The style sheets and scripts do get extracted though so you can work with them too.
Also take a look at the XML files in the 1033 folder like VB_ACTOC.XML - this is the Table of Contents for the Access VBA help. It contains topic nodes with labels and urls for each item in the help file:
<topic>
<label>CheckBox Object</label>
<url>mk:#MSITStore:vbaac10.chm::/html/acobjCheckBox.htm</url>
</topic>
The mk:etc... url can be put on the HH command line to open that topic in a regular HTML Help window. Also, it shows the source CHM filename, and the relative path of the file when decompiled.
hh mk:#MSITStore:vbaac10.chm::/html/acobjCheckBox.htm
Working from these files, you could put together a script to find/grep files by keyword and show them in a browser, or you could reengineer the files into some sort of database or other lookup capability to work with SciTe's command based help system.
Some sites with more info about using HH.EXE:
HTMLHelp command-line
tips on using the HH command line and links to other sites
KeyHH 1.1
an alternate/supplemental program to HH.EXE for working with CHM files
The main files are held (for Office 2003 anyway) in Program Files\OFFICE11\1033, but accessing pages within them could be a bit tricky as Microsoft have gradually had to reign in the ability to delve into CHM files over the years due to security concerns.
This page (download) has some good info on what might still be possible as far as linking to specific pages inside a CHM
Having said that, I don't think this file is the default help shown to most users nowadays, but it's close enough, missing only the Office 2007 pimping most of the time. The online help seems to be set as default unless you specifically disable it during the Office install. The URLs are, I think, not very SEO friendly so couldn't be guessed. I suppose you could borrow a sneaky trick from scammers and craft URLs that point to the top link on Google, thusly: Range.
EDIT: Google cache link?
Inspired heavily by Lunatik's answer, adding:
command.help.$(file.patterns.vb)=http://www.google.co.uk/search?hl=en&newwindow=1&q=site%3Amsdn.microsoft.com+%222003+VBA%22+$(CurrentWord)
command.help.subsystem.$(file.patterns.vb)=2
to my vb.properties file gives me a reasonable work around (loads a Google search results page with search criteria of:
site:msdn.microsoft.com "2003 VBA" $(CurrentWord)
Obviously no guarantees of it taking me to a helpful page, but then the inline help in the VBA editer isn't all that reliable on that one either...
Can anyone who knows SciTe better suggest a more elegant solution?