How to create a PDF/E file with Video for long-term archiving system? - pdf

I am trying to create a PDF/A file for a long-term archiving system. My PDF file contains a video. According to Wikipedia, PDF/A doesn't accept video content, PDF/X is for graphics exchange, and PDF/E is for dynamic content like videos. I am trying to create a PDF/E file with Acrobat Pro, but at the end I get a failure message saying that Acrobat cannot create the PDF/E file. The Preflight tool inside Acrobat Pro fails in the same way during the conversion. Is there any other way to create a PDF/E file with a video inside, or is it even possible to create a long-term archiving PDF with a video inside?

The short answer is no: there is currently no archival standard for PDFs with embedded video. However, there is little evidence that standard PDF is going to become obsolete any time soon, so I think the best strategy may be simply to wait for the standards to catch up.
If you're seeking a little more reassurance, you could check if other PDF renderers support your files. If one or more of the open source implementations have no problem with it, then there's very little reason to suppose the content is at risk.
If you really want to protect against problems with PDF, you could diversify somewhat and make an HTML+MP4 version of the same content. That would ensure that even if the PDF strategy fails, there's something else to fall back on. Having some or all of the content of one item available in alternate formats also makes any future format conversions easier to validate.
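As a rough illustration of the HTML+MP4 fallback, here is a minimal Python sketch that writes a standalone HTML5 page next to the video file. The function name, the page layout and the file name `clip.mp4` are assumptions made for the example, not part of any archival standard.

```python
from html import escape


def build_fallback_page(title: str, body_text: str, video_file: str) -> str:
    """Return a standalone HTML5 page that shows the text and plays the video."""
    # Blank lines in the input separate paragraphs; all text is HTML-escaped.
    paragraphs = "\n".join(
        f"    <p>{escape(part.strip())}</p>"
        for part in body_text.split("\n\n")
        if part.strip()
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "  <head>\n"
        '    <meta charset="utf-8">\n'
        f"    <title>{escape(title)}</title>\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{escape(title)}</h1>\n"
        f"{paragraphs}\n"
        # The video is referenced by relative path, so keep it beside the page.
        f'    <video controls src="{escape(video_file, quote=True)}"></video>\n'
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    page = build_fallback_page("Report", "First paragraph.\n\nSecond one.", "clip.mp4")
    with open("index.html", "w", encoding="utf-8") as handle:
        handle.write(page)
```

Keeping the page free of scripts and external stylesheets means the fallback copy depends only on plain HTML and the video file itself.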

Related

SOLVED: Looking for a way to automate generation of internal PDF hyperlinks

I have a 300+ page PDF document which needs to have internal page links added to it to reference other pages in the document. The document is created in Visio, which does not support consistent hyperlink generation in PDF export, so the link generation needs to be done on the PDF itself, not up the chain. This is an annual need, and regularly takes over a week due to the amount of manual labor, time, and checking needed.
The text which is hyperlinked has the same format in every case (e.g., "See Section 8.18 - How to Hyperlink"), and I'm certain this can be automated, as there are commercial plugins which can do this, but they cost hundreds of dollars, and are not able to be used in this case due to restrictions imposed by my employer. Example: https://www.evermap.com/ABAddingHyperlinks.asp
I've been looking through the Acrobat Plugin SDK and it seems doable, but I know there is also a higher level scripting language available for Acrobat. Does anyone have experience working with PDFs or with the Acrobat scripting / SDK tools? Are there open source methods for doing this? I've looked everywhere! Willing to learn. I've looked at Ghostscript (Adding internal hyperlink to a pdf) but what I need is way more than just a Table of Contents, and links can appear in many places on the page with line breaks, so consistency is a challenge.
EDIT: I found a solution! Bluebeam software's Revu Extreme works pretty darn well, and can be used as a 30 day free trial of all features. Only limitation is that links which extend across a line break (multiple lines of text) do not properly work in Edge or Chrome's PDF viewer, as they don't properly support hyperlinks with multiple click regions. I've submitted a ticket requesting a feature be added to Revu that fixes this, but for now those links need to be manually fixed following the batch link. The process is described here: https://support.bluebeam.com/online-help/revu2018/Content/RevuHelp/Menus/Batch/Link/Batch-Link--T.htm
You can add hyperlinks to a document with Ghostscript, but you would need to know the location of the text to hyperlink and the destination in advance. You cannot automate finding them, or in fact write any reasonably simple code to automate the task, using Ghostscript. You'd need to modify chunks of the PDF interpreter, which is written in PostScript, and that is not a task for anyone who is not a PostScript expert.
You could probably do it with MuPDF, and probably using MuJS to script it, but I don't know enough to be certain. It would still require some coding effort, but it would probably be easier to use JavaScript at least.
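Whichever tool ends up writing the links, the first step is finding the references in the page text. Here is a minimal Python sketch of that step only, assuming the text of each page has already been extracted by some PDF library (not shown) into a dictionary keyed by page number. The pattern follows the "See Section 8.18 - How to Hyperlink" format from the question; the function name and data layout are made up for illustration.

```python
import re
from typing import Dict, List, Tuple

# Matches "See Section 8.18" and captures the dotted section number.
# \s+ also matches newlines, so a reference broken across lines is still found.
SECTION_REF = re.compile(r"See\s+Section\s+(\d+(?:\.\d+)*)")


def find_section_refs(pages: Dict[int, str]) -> List[Tuple[int, str]]:
    """Return (page_number, section_number) pairs in page order."""
    refs = []
    for page_number in sorted(pages):
        for match in SECTION_REF.finditer(pages[page_number]):
            refs.append((page_number, match.group(1)))
    return refs


if __name__ == "__main__":
    sample = {
        1: "Overview. See\nSection 3 - Setup for details.",
        2: "See Section 8.18 - How to Hyperlink",
    }
    for page, section in find_section_refs(sample):
        print(f"page {page} refers to section {section}")
```

This does not create any links. It only produces the list of references, which still has to be combined with the text positions and the target pages inside whatever PDF tool writes the link annotations.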

Render PDF in Xamarin Forms from a Stream

I have been searching the web but have come up empty, so I felt the need to ask. We want to render a PDF file on iOS, Android and UWP through Xamarin Forms and, most importantly, from a Stream.
I have come across answers like this, however they just reinforce the notion of loading from a file or URL.
We are not allowed to store the PDF files unencrypted on disk, so the only two possible options I can see are to:
Find a viewer that can render from a Stream
Implement/expand a viewer that can render from a Stream
I haven't been able to find much based on these options, so I am hoping that someone knows of a framework or method for achieving this, or can at least suggest a library to use as a starting point.
PDFTron PDFNet SDK is available for all the listed platforms, and Xamarin, and supports opening and viewing a PDF from a stream (no disk access required).
https://www.pdftron.com/pdf-sdk/xamarin-library
https://www.pdftron.com/documentation/xamarin/guides
While PDFTron was the only supplied answer, I encountered great difficulty, firstly in getting any costing information from the company themselves, and secondly because the trial downloads and samples wouldn't even compile.
I did some further research into paid-for solutions and found that SyncFusion offers a PDF viewer control that can also render from a Stream. They also provided answers to all my questions and got us up and running in less than a day.

Selecting text and image from pdf through any programming language

I'm trying to develop a tool/web application that imports a PDF file. The user should be able to select text and images in the PDF with a mouse click and mark each selection as title, content or image using one of three buttons. The marked content and images will then be copied to the clipboard or pasted into a Word document, which is going to be another part of the project. Which programming language would be suitable for this?
I'd probably try researching a pure browser-side solution using pdf.js and the clipboard API.
Otherwise, you'd still need the clipboard API in the browser, and the server side may be powered by any programming language which can be hooked into a web server and has a library to parse PDFs.
You said nothing at all about your prospective server platform, but to name a few: .NET has PdfSharp, which is able to read PDFs, and Python has a host of tools available for it. Besides, there are a bunch of command-line utilities for extracting data from PDFs, which can be called from any programming language able to call external processes.
Note that this only appears to be a simpler solution than using pdf.js. Unless your PDFs are really uniform (say, invoices created by some piece of software), so that your PDF parser can know which bits of data it has to extract and return, the parser will need to return all the data it extracted to the client, and you'll need to somehow render it all there. Maybe that's exactly what you need, but maybe not.
Since PDFs are really tailored for typesetting and not presenting information in a structured manner, I'd try to piggyback on an already hard-core PDF rendering solution which runs in the browser, so see above.
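To illustrate the "call a command-line utility from any language" route, here is a minimal Python sketch that shells out to `pdftotext` from Poppler. It assumes `pdftotext` is installed and on the PATH. The function names and the file name `input.pdf` are only for the example.

```python
import subprocess
from typing import List


def pdftotext_command(pdf_path: str) -> List[str]:
    """Build the pdftotext command line for one PDF file."""
    # "-layout" keeps the physical layout; "-" sends the text to stdout.
    return ["pdftotext", "-layout", pdf_path, "-"]


def extract_text(pdf_path: str) -> str:
    """Run pdftotext and return the extracted text (raises on failure)."""
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(extract_text("input.pdf"))
```

This returns plain text only, with no images and no positions, which is the limitation described above: the server can extract the data, but the client still has to present it in a selectable form.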

How to convert InDesign IDML to Tiff?

I have a requirement to take IDML files provided by a client, twiddle them a bit to fill in some placeholders and generate a TIFF file. This needs to happen automatically and I have InDesign Server at my disposal.
I have the first part down. I have also found how to connect to InDesign Server via SOAP and convert IDML files to hi-res PDF or low-res JPG (this implies a few other options).
I am at a bit of a loss as to how to take it the rest of the way to generate a TIFF file, and the Adobe forums have not been much help. It is my impression that this sort of thing is exactly why the IDML format was introduced, so I'm assuming there's decent support out there for it. The best I've been able to come up with so far is a chain: IDML to PDF (or SVG) via InDesign Server, then to PNG via the Inkscape command line, then to TIFF via System.Drawing. That seems horribly contrived and fault-prone (and I have no idea how I'm going to handle multiple pages).
Any ideas?
I don't believe there is a way to export to TIFF via InDesign Server. However, I did find this post on the Adobe Forums that suggests using Photoshop to render the TIFF after exporting it as a PDF from IDS. Maybe that would be an option? Otherwise, maybe you could use one of the formats that you CAN export to (i.e. JPG, PDF, EPS).
Hope this helps!
For reference, I ended up using Ghostscript to achieve the results.
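The exact Ghostscript invocation isn't given, so the following is only a sketch of what a PDF-to-TIFF step could look like, driven from Python. It assumes the `gs` executable is on the PATH (on Windows it is usually `gswin64c`). The device `tiff24nc`, the 300 dpi resolution and the file names are choices made for the example.

```python
import subprocess
from typing import List


def ghostscript_tiff_command(
    pdf_path: str,
    output_pattern: str = "page-%03d.tif",
    dpi: int = 300,
) -> List[str]:
    """Build a Ghostscript command that renders each PDF page to a TIFF."""
    return [
        "gs",                 # assumed executable name
        "-dNOPAUSE",          # do not pause between pages
        "-dBATCH",            # exit when the input has been processed
        "-dSAFER",            # restrict file access during rendering
        "-sDEVICE=tiff24nc",  # 24-bit RGB TIFF output device
        f"-r{dpi}",           # resolution in dots per inch
        # %03d in the file name yields one output file per page.
        f"-sOutputFile={output_pattern}",
        pdf_path,
    ]


if __name__ == "__main__":
    subprocess.run(ghostscript_tiff_command("exported.pdf"), check=True)
```

The `%03d` in the output name gives one TIFF per page, which covers the multi-page case mentioned in the question.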

Add watermark to various documents investigation

I've been asked to investigate the feasibility of adding watermarks to documents when printed through our application. The documents will consist of Word, PDF and CAD files.
The interface of the application is VB6 with a plethora of VC6 DLLs.
I can see a couple of possible solutions:
Convert all documents to PDF, add a watermark and then print.
Find a print driver that will add a watermark to all documents prior to printing and install it and reenable it at runtime if it gets disabled for any reason.
Third-party suites are a possibility (we use Volo View Express for viewing CAD files), but since this application is nearing end-of-life we wouldn't want to spend too much on it.
Has anyone had any experience of the above? Any gotchas that will bog me down?
Tracker Software has a good set of PDF APIs that will allow you to implement the solution you already have in mind. I've used their Image and PDF libraries quite a bit with a lot of success in both VB6 and .NET. Single-user licenses are not expensive (depending on how you look at it, I guess), and I've found support to be excellent as well.