Check if PDF files are the same

I have a folder with hundreds of PDF files with different names. Some of them have identical contents. I would like to find out which PDF files are the same and collect them in another folder. Is there an open-source or online program that can do this check, and ideally also move the duplicates into a separate folder? A VBA macro would work too.

To find duplicate PDFs in a folder you can use Adobe Acrobat. With Acrobat you can find the duplicate PDFs and also delete them.
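If a script is acceptable instead of a program, the check can be sketched in Python by hashing each file and grouping files with equal hashes. This is only a minimal sketch under some assumptions: "the same" is taken to mean byte-for-byte identical (two PDFs that look alike but carry different metadata will not match), the function names and the `duplicates` folder name are made up for illustration, and the first file of each group is kept in place while the others are moved.

```python
import hashlib
import shutil
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_pdfs(folder: Path) -> list[list[Path]]:
    """Group the PDFs directly inside `folder` whose bytes are identical."""
    groups = defaultdict(list)
    for pdf in sorted(folder.glob("*.pdf")):
        groups[file_digest(pdf)].append(pdf)
    # Only groups with more than one member are duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]


def move_duplicates(folder: Path, target: Path) -> list[Path]:
    """Keep the first file of each group; move the others into `target`."""
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for group in find_duplicate_pdfs(folder):
        for extra in group[1:]:
            dest = target / extra.name
            shutil.move(str(extra), str(dest))
            moved.append(dest)
    return moved
```

Calling `move_duplicates(Path("my_pdfs"), Path("my_pdfs/duplicates"))` would leave one copy of each document in place. To collect every member of a group instead, loop over `group` rather than `group[1:]`.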

Related

Can you embed a separate PDF in InDesign and open it after exporting to PDF?

I would like to ask the following, if possible. We have a client who wants a separate PDF document embedded in a main PDF document, so that it opens when you click it. It is like the function in MS Word where you can attach another Word document inside a Word document (Word-ception, lol) and still open it.
I've tried it in Acrobat Pro with the Attachment and Link tools. Another option was to put the linked document on an FTP server for accessibility, but our client really wants this functionality. Is this possible in InDesign?
Thank you!
Using Word as your example vehicle, there are several ways to link two documents.
One is as an appendix to the other. In PDF terms this is a merge or binding, but the result is one flowing document with separate sequential sections/chapters.
Another way is to link to an external file. In PDF terms this is a hyperlink to a second file, which can be a folder-relative local path or an absolute web reference. You have tried that.
In Word we can add objects internally with icons. In PDF that can be a file attachment annotation, which the reader saves externally and then acts on. You also seem to discount that approach.
Finally, PDF offers an Adobe-specific structure where multiple PDF attachments can be embedded in an overall PDF wrapper. These are called Portfolios, not to be confused with Adobe's Portfolio service.
They are unpopular because in a browser without Adobe Reader they should only offer the cover page.
In more secure offline readers, meanwhile, the files may well be shown as attachments that you need to save or open independently to view them.
Only some non-Acrobat viewers can display them as a collection. In the past that required running insecure Flash (SWF), but I understand that has changed?
Here is how the 3 internal PDF files seen above were shown in older Acrobat 9.
Possibly the best experience is with Foxit Reader.

Open a .pdf file

I am trying to open a .pdf file within Excel, like an iframe in HTML.
My requirements are:
Save the paths of multiple PDF files in Excel.
Excel should open each .pdf file within Excel itself (no need to open it in a separate PDF window).
It should be like an iframe in HTML. The user should be able to view the .pdf within Excel itself.
I know this is a little weird, but can anybody help me?
You could probably get the filenames via VBA.
Here are some approaches that claim to work:
Loop through files in a folder using VBA?
As for opening a PDF inside Excel, that's kind of pushing it.
Since your request is exotic, I can think of an exotic workaround:
If you can spare the interactivity, you can simply make copies of your PDFs, convert them to Word formats, and load them that way. I've seen people convert PDFs to JPGs just to load them into other documents, but that's rudimentary and really fringe.
Otherwise you are facing a lot of custom coding to make it possible.

How to open a .html document as a .pdf

I have Excel 2010 and Adobe Acrobat DC installed.
I have several thousand .html files saved in a folder.
I plan on doing several things, but at this point I basically want to convert these files into PDFs. I think I just need to be able to open them and then save them as PDFs, which I have no idea how to do...
I've been able to open them in the internet browser through VBA, but not much after that.
1. Copy the URL of your .html page.
2. Go to http://www.web2pdfconvert.com
3. Paste the URL there.
4. Click on the "convert" to PDF option.
5. Now click on the "download pdf" button.
6. Congrats, you are done.
EO.Pdf (eo.pdf.dll), roughly $650.00: a .NET DLL that converts HTML documents to PDF.
You will have to write some code, but you will be able to batch process the HTML files.
http://www.essentialobjects.com/Products/Pdf/Default.aspx
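For the batch-processing idea, here is a rough Python sketch that uses a different technique from the DLL above: it drives a Chromium-based browser in headless mode with its `--print-to-pdf` switch. The executable names in `BROWSER_CANDIDATES` are assumptions about the local installation, and the function names are made up for illustration, so treat it as a starting point rather than a tested pipeline.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed executable names; adjust to whatever is installed locally.
BROWSER_CANDIDATES = ("chrome", "google-chrome", "chromium", "chromium-browser", "msedge")


def find_browser():
    """Return the path of the first candidate browser on PATH, or None."""
    for name in BROWSER_CANDIDATES:
        found = shutil.which(name)
        if found:
            return found
    return None


def build_command(browser: str, html_file: Path, pdf_file: Path) -> list[str]:
    """Build the headless print-to-PDF command line for one HTML file."""
    return [
        browser,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={pdf_file.resolve()}",
        html_file.resolve().as_uri(),  # file:// URL of the local page
    ]


def convert_folder(source: Path, target: Path, browser: str) -> list[Path]:
    """Convert every .html file directly inside `source` to a PDF in `target`."""
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for html_file in sorted(source.glob("*.html")):
        pdf_file = target / (html_file.stem + ".pdf")
        subprocess.run(build_command(browser, html_file, pdf_file), check=True)
        written.append(pdf_file)
    return written
```

With several thousand files, starting one browser process per file will be slow, but it keeps the sketch simple. `find_browser()` returns `None` when no candidate is found, so check its result before calling `convert_folder`.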

docx4j word/googledocs compatibility

I'm creating a program which reads a docx file and displays it in a JavaFX graphical interface, with buttons in place of flags put in the docx. When a button is pressed, the program modifies the docx taken as input.
I'm using the docx4j API for extracting and modifying the document.
The problem is that the program fails if I give it a docx generated by Microsoft Word. I'm forced to use a workaround.
I take my docx made in Word, load it into Google Docs, and use the "Download in .docx format" option. If I put the docx from Word directly into my program, it fails.
I noticed my Word file was half the size after being passed through Google Docs. Likewise, if I take a docx file downloaded from Google Docs, open it in Word, modify one letter, and save it, it becomes twice as large. For the record, I use Word 2008.
That's it, so I'd like to know if someone knows what explains this difference.
Thanks

Using Texmaker, I want to lock the PDF file created so others cannot copy text or print the file

I'm creating PDFs using Texmaker. I would like to create some of the PDF files so that when I give them to others, they are not able to print the file or copy the text. I know I can do this with some PDF creator applications, but can I do it from a command-line program I already have with LaTeX, MiKTeX and Texmaker?
It wouldn't be effective anyway. There are bits in the PDF format that purport to forbid the user from doing this, but they are really just suggestions that the reader application may or may not act on. There is nothing to stop a user from removing the code that inspects the bits from a free/libre PDF reader, or from just running a tool over the file to remove the restrictions.
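To make "bits" concrete: in an encrypted PDF the permissions are stored as a signed 32-bit integer (the `/P` entry of the encryption dictionary), and a viewer decides what to allow by testing individual bits. The Python sketch below decodes four of those flags using the bit positions from the PDF specification. The function and dictionary names are made up for illustration, and nothing in it enforces anything, which is rather the point.

```python
# Bit positions (1-based, counted from the low-order end) of selected
# permission flags in the /P entry, as listed in the PDF specification.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
}


def decode_permissions(p: int) -> dict[str, bool]:
    """Return which of the selected actions a /P value claims to allow."""
    p &= 0xFFFFFFFF  # view the signed value as 32 unsigned bits
    return {name: bool(p & (1 << (bit - 1))) for name, bit in PERMISSION_BITS.items()}
```

For example, `decode_permissions(-24)` reports printing and copying as not allowed while modifying and annotating are allowed. A reader that never calls such a check, or a tool that rewrites the value, ignores the restriction entirely.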