How can I replace Image in PDF Programmatically (using command line ideally) - pdf

I'm looking for a method to replace an image in an existing PDF file (setup as a template).
In a perfect world I could just run a command (linux or windows will work, I'm not picky) at the command line, however, if need be, I could also implement something using a scripting language or even a full blown program at this point (I have visual studio).
I honestly can't believe how hard it has been to find an example of this being done somewhere.
We're running Windows w/ Cygwin and have Acrobat Pro XI as well as Illustrator CS6 from which the PDF file is originally created.
Essentially what we're looking for is an efficient way to replace images for PDF files that we're sending to the printing house.

Related

Issue with ghostscript rendering PPT into PDF

I've been tinkering with Ghostscript with a port monitor(on a HP PCL 6 Universal driver) to convert print job into PDF. I've tested with a few applications such as Words, Excel, Adobe Reader, Microsoft Edge etc and they are all working properly.
However upon testing Microsoft Powerpoint 2016, it seems like there are some graphics that are unable to be rendered properly through Ghostscript.
Actual Slide Below
Output From Ghostscript in PDF Below
I've tested this even with some other PDF generators such as BioPDF,CutePDF as well as AdobePDF and they would all result in the same output as above.
Just wondering has anyone tried and have faced similar issues before? if so could someone point me in the right direction??
What you are doing isn't a single step PowerPoint to PDF and Ghostscript is not rendering the PowerPoint. In fact if you are creating a PDF file Ghostscript isn't (ideally) rendering anything.
What's actually happening is that you are asking PowerPoint to print to a canvas, which is then passed to the PostScript printer driver. That produces PostScript which is sent to the Port. Your (and others) Port Monitor then sends the PostScript to the 'Distiller' (in your case Ghostscript and the pdfwrite device). The Distiller reformats the vector drawing commands into a PDF format and builds a PDF file from them. It doesn't render (turn into a bitmap image) anything unless forced to.
Obviously there are several places along that road where the problem could creep in. Given that you say that the Adobe product (the others you mention al use Ghostscript) has the same problem, I think its safe to assume that the problem isn't Ghostscript.
This also means that you aren't using the driver you think you are. Adobe can't handle PCL as an input medium as far as I'm aware, and nor can Ghostscript. GhostPCL will handle PCL as an input, but that's not what you say you are using.
Of course you haven't linked to an example file to demonstrate the problem, nor supplied an example command line, so this is all supposition.
Now if, somehow, you are using a PCL6 device, then the problem is most likely due to the presence of rasterOps in the output. Rasterops are part of the PCL imaging model which do not exist in PDF and are a form of transparency. There are three ways to handle such content for a PDF output device; firstly render the whole page content to an image, secondly ignore the rasterOps objects, thirdly treat the rasterOps as opaque.
GhostPCL and the pdfwrite device take the third option. So, its just conceivable that your original content has some transparent objects which are being handled as rasterOps by the PCL printer driver, and then rendered as opaque by GhostPCL and the pdfwrite device.
If that's somehow the case then the solution is simple; don't use a PCL printer driver, use the PostScript one.
If you post a link to a (simple, eg single page) example of what you are sending to Ghostscript, and a command line, then I can look at it. Please don't send me the PowerPoint, I can't use it and even if I could, my print setup would not match yours. I need the data being sent to Ghostscript.
[EDIT after looking at files]
Don't mean to sound like I'm lecturing, the problem is people find these result on Google searches and then try to apply them based on a poor understanding of what's happening. So I find it best to be really clear in my answers about what's going on. It saves questions later :-)
The first thing I see is that the PCL is indeed PCL, and if you try running that through Ghostscript it throws horrible errors and exits. So presumably you aren't doing that.
The PostScript file contains nothing except huge images, rendered (presumably at 600 dpi) contains 2 pages, the two pages look like your images above. Which is why the PostScript is better than 20 times larger than the PCL file.
But.... If I open the .ppt file with OpenOffice (4.0.0 is what I have to hand) I see exactly the same thing. I don't, I'm afraid, have a copy of Microsoft PowerPoint, but from what I see here there are two conclusions;
firstly that the PDF I get looks pretty much like the PowerPoint when viewed with OpenOffice at least. So there's something 'interesting' about your PowerPoint.
secondly, even if that's not what you expect, its what's in the PostScript program. That means that either PowerPoint rendered the slide to a bitmap or the Windows printing system/HP driver did.
Now, if I run the PCL through GhostPCL instead of Ghostscript (rendering, not producing a PDF) then the result is more like what I think you are expecting. However, when sent to a PDF file the result is horrible. Which strongly suggests to me that there's some form of transparency involved, PostScript doesn't support transparency at all, and PCL does it through rasterOPs.
I'm afraid that this means that the problem lies either in PowerPoint, the Windows print system or the PostScript printer driver you are using. Since the PCL is at least close to what you expect, I suspect that this means PowerPoint is doing the right thing, and its the printer driver messing up. It appears you are using the Windows PostScript printer driver.
So there's no way you can 'fix' this for files like this, at least not with Ghostscript. You would need to 'fix' the Windows PostScript printer driver, or possibly the Windows print system. You could try reporting a bug to Microsoft, presumably these files print incorrectly when sent to physical PostScript printers too.

Force a webbrowser to display a PDF file only on Adobe Acrobat Reader

I create a PDF with iTextsharp and then I show the preview of the PDF inside a webbrowser control. From the preview the user can SAVE or PRINT using the defaults Adobe Reader's buttons
Working on Windows x64 bits with Adobe Reader as the default PDF viewer everything works fine.
The same program on a Windows x64 bits but with Foxit Reader as the default PDF open the file on Foxit Reader on full application window, outside my program.
I need to manage that.
My code is like
Dim PathToPDF As String
PathToPDF = DirectoryOfMyApp & "\ReportPreview.pdf"
ReportPreviewWebBrowser.Navigate(PathToPDF)
Where DirectoryOfMyApp just gets the C: or D: letter of the hard disk.
I read this link
How to start an Adobe Reader or Acrobat from VB.NET?
but a line like
ReportPreviewWebBrowser.Navigate("acrobat", PathToPDF )
didn´t work and I think the webbrowser control don´t have the option to choose the PDF viewer
https://msdn.microsoft.com/es-es/library/system.windows.forms.webbrowser(v=vs.110).aspx
Is there a way to set the webbrowser to use Adobe Acrobat Reader only or to force any other PDF viewer to show the PDF inside the webbrowser control?
I agree with Zaggler on his comments on this. You are making assumptions at a certain point on software that is installed on an end user's computer. Unless you are going to make the application's PDF viewer be part of a dependency installation or cooked into .NET you cannot guarantee they have that program to use. Nor can you guarantee it's installed location.
However there is a cheap hack for Windows based processes you can do in VB.NET. You can use the ole System.Diagnostics.Process()
Sub Main()
Dim nProcess = New System.Diagnostics.Process()
nProcess.Start($"D:\PdfFile.pdf")
End Sub
In this example I did a quick file location, you can try to ensure it is a valid location that will not change or is in your app's running process folder. This is really low tech as far as development goes, but it is basically saying: "Run me a process, any process, at this location. I don't care what it is, use the Windows defaults to determine what to do with it."
So when I run this on my Windows 10 Dev box it loads up Edge to display it, at home it would fire up Adobe Viewer. It is just opening the file essentially with the OS's choice of what is using that file extension. Not glamorous or very good for hardened code but it works when you want something quick to happen.
No, you can't do this.
You can't even guarantee that Adobe Reader is installed at all.
Reader might not even exist on the machine. It's not built into Windows, and not everyone uses it. Even if it is, FoxIt isn't the only alternative. A big one is that Chrome includes it's own PDF viewer.

Killing Adobe Acrobat Viewer Process to free pdf

I have an app that uses a webbrowser window to view several types of documents. One of those document types is a pdf file. After viewing this file and selecting certain criteria in the form it is uploaded to a database. What I would then like to do is more the file to a temp directory that I've already created. Problem being is that a process is still holding that pdf even though I have removed the pdf from the webbrowser. So apparently the problem is that adobe acrobat is running in the background and still retaining the file.
I did fix it.. but not in a way to my satisfaction. Essentially I ran a shell script in the program and killed off AcroRD32.exe. This frees the file and gives me the ability to move. I really dislike having to do this.. not very elegant whatsoever. Is there a better way to handle this situation?

Exported PDFs from Mathematica 8 won't print

UPDATE: I wrote to Wolfram support about this and will update the post if they can resolve the problem. Sorry for spamming SO with a technical support question, but here it remains in case anyone else is having the same issue.
Is anyone else having this problem with Mathematica 8? I recently upgraded and noticed that when I export Graphics to a PDF file, although the file appears fine on my computer, it prints as a blank page. For example, try
Rectangle[{1,1}]//
Graphics//
Export["~/test.pdf",#]&
which creates a PDF file containing a black square. This file opens fine, but if I send it to my department printer I just get a blank page. If I don't export the graphics but print the notebook from MM, no problem, the graphics print as expected. If I use MM 7 to do exactly the same thing, the PDF file prints as expected. Exporting to PNG in MM8 seems to work fine. And, using the context menu Save Graphics As ... or File > Save Selection As ... to create a PDF containing just the graphic also works. However, these graphics eventually get included in a TeX document, and it would be far better if I could continue using the script I've got that doesn't require any button clicking to generate them.
I'm running MM 8.0.0.0 on Mac OS 10.6.7. I have not been able to test this on another printer yet, but this printer has never given me problems before and prints other PDF documents fine. Any ideas why this is happening?
Wolfram Research responds:
...
This issue has been reported by other users as
well and our developers are currently looking into it. I have added your
details to the report so you can be notified when this is resolved.
In the meantime, the alternatives that you could try are:
Try a different printer.
Rasterize the image with the function 'Rasterize' before exporting. If
the rasterized image loses some resolution, you could use the option
'ImageResolution' to edit this.
Rasterize[image, ImageResolution -> xxx]
Surely this is a bug (please report it to support#wolfram.com), but you can work around the problem by selecting the graphic and choosing File > Save Selection As... from the menu (or Save Graphic As... from the contextual menu). This produces a slightly different file that doesn't appear to exhibit the undesirable behavior we observe from Export[].
These problematic files, and LaTeX PDFs that include them, can be properly printed by Adobe Reader 10.1.2. That's if you're okay with installing and using a 450MB PDF reader.
I reproduced the problem (leading me to this question) with Mathematica 8.0.4.0 on Mac OS X 10.7.2. Wolfram suggested lame workarounds like Rasterize and told me
This issue has been addressed by our developers, and a fix will be included in a future version of Mathematica.

PDF Outline Text - Automation of Acrobat Sequences

I have built an application that automates the filling out of form fields inside a pdf. It then takes various assets and combines them together to generate a "print ready" product. All of this is accomplished using the magic of iTextSharp. When form fields are populated, they are then flattened to text. The problem is that even with the fonts embedded they aren't really attached to the form fields in a meaningful way (like straight text elements are) and the printers are complaining that the pdf is generating licensing errors due to this. I researched this a bit and it just seems to be the nature of how form fields are.
The artists we are working with requested that we research a way to "outline" the text that is created from flattening the form fields. I found that running the PDF Optimizer with a custom preset allows for Text Outlining in Acrobat, and even better I can generate an Acrobat Sequence that runs this command on the pdf. The problem is that Sequences can not be automated, at all.
I found a plug-in called AutoBatch that allows for the execution of Sequences on the command line through a batch file. The downside is that this would require installing Acrobat Pro and the Plug-in on the server this application will be running on. Further it seems like an overkill solution just to outline the text in the pdf. For all I know at this point iTextSharp may allow me to do this programmatic, but searching for such a thing on google returns little results and nothing relevant.
So the question: Is there a better way to outline text in a pdf than the current solution I have implemented or am I kind of stuck?
TLDR; PDF is generated w/ non-standard fonts. I need to "outline" this text to send it to the printer. Currently using AutoBatch Acrobat Plug-In to execute Acrobat Sequence from the Command Line. Seems excessive, wondering if anyone knows a better way to automate font outlining.
I am also in a printing environment and have used forms for "Box Covers" plenty of times to shorten the code used to produce box covers.
I simple us "pdfStamper.FormFlattening = true;" and the printers (Xerox DP180 and DC5000) has no problems in using the PDF.
The moment I leave out FormFlattening the printer gives a lot of errors regarding the PDF.
If you are using FormFlattening then check if the printer has the font locally installed in order for it to reference the font from the print engine instead of the PDF resources.