Fine-tuning ghostscript PDF to PS conversion

Fine-tuning ghostscript PDF to PS conversion - pdf

I have a program that generates a PDF as output. If I send this file to a printer using the Adobe viewer, it prints exactly as wanted. In particular, the application is printing labels and there's a requirement that every last pixel on the page is used, i.e. no margins whatsoever.
I'd like to try and automate this process. GhostScript seemed a logical choice. I used the command lines
gs -dBATCH -dNOPAUSE -sDEVICE=psmono -sOutputFile=A4_300.xxx -sPAPERSIZE=a4 A4_Print.pdf
... or alternatively
gs -dBATCH -dNOPAUSE -sDEVICE=ljetplus -sOutputFile=A4_300.xxx -sPAPERSIZE=a4 A4_Print.pdf
I can send the output file, A4_300.xxx, to the printer via LPR and it almost prints well, but there's about 6-8 mm missing on all sides, i.e. there's a margin being enforced, and the text that should be printing in that area is actually being cut off.
Paper size should be a4, and that much is working correctly. But how can I arrange for the output to fill the whole page?
The output device is "some kind of HP laser printer"; I haven't seen the physical device. A similar printer I tested with was able to process output both for "psmono" (that produced PostScript) and "ljetplus" (binary, but printable).
Any advice, please?

First of all: are you sure that your printer is physically able to print edge-to-edge? Which printer model is it?
It may well be that the printer itself enforces the "missing 6-8 mm on all sides". Since you see the margin "area actually being cut off", it means the printer indeed receives the complete image, but it crops the image to what appears as *ImageableArea keywords in PostScript printer PPDs (PS Printer Description files).
If your printer supports edge-to-edge printing indeed, then you may need to enable it as a default...
...by some semi-secret setting in the front panel menu (if your printer has s.th. like that), or...
...by accessing the web-based printer configuration panel from your computer's browser (should your printer support that), or...
...by logging into the printer via telnet, rsh, ssh or msh (depending on your printer to allow this).
The actual procedure to set this depends on your printer model. It should be described in the printer manual.
If you are unlucky, the device simply doesn't support borderless printing. Then buy or find a model that does what you want ;-)
Update: I had missed your statement "If I send this file to a printer using the Adobe viewer, it prints exactly as wanted." From this I conclude that your printer must indeed be supporting borderless printing.
If your LPR client uses any form of PPD (as is the case if you print via CUPS, f.e.), then check out my hints about modifying PPDs (which also works for Windows systems) here:
"What lpr arguments do I need to print a 1400x800 pixel image on a 4x6 label?"
"What's the easiest way to add custom page sizes to a PPD?"
Most likely you do not need to finetune your Ghostscript output; it is fine as the cropped printouts show.
Most likely you need to tweak your LPR client so that its "driver" does not destroy what you want to send to the printer.

Related

Converting Windows .PRN file (PCL) to PDF

I have been succesfully capturing PCL content sent by old machinery to a parallel port and converting it to PDF using GhostPCL for a while.
However, we have some older industrial machinery which is based on Windows 2000 and outputs to a HP Laserjet printer via the parallel port. Unfortunately, the software on the machine does not allow additional software or printers to be installed.
The problem is that whilst the captured output appears to be PCL graphic data, I have not found any tools which can convert it - GhostPCL attempts, and you can make out the text a little, but it is completely corrupted.
The captured output results in the output from GhostPCL
I can see that the captured output starts with:
ESC E (PCL command for Reset)
ESC &l0L (PCL command to disable skip perforation)
ESC &r1U (*** UNKNOWN ***)
ESC &l1H (PCL command to Feed from tray 2)
ESC *o0M (*** UNKNOWN ***)
ESC &126A (PCL command for A4 portrait paper)
ESC *g8W (PCL command to configure raster data - 8 bytes)
I can see that the captured output has some PCL codes which do not appear in the official documentation, which results in the weird characters at the start of the PDF.
Does anyone know how to convert this file to PDF ?

Your description shows an attempt to read a language different to pcl, so was the older system designed to talk to a default printer using Epson ESCAPE encoding. so its a dot printing file that the hardware will position those dots in the correct place and pressure, you could try converting to a bmp then massage the image into a pdf page.
You say an HP is connected but is that the true capture of what it is agreeing to use at runtime?
For example If I attempt to save my HP inkjet print file at time of printing I will get a PDF ! but why since the printer normally cant handle those direct?
What may I find If I look and see the default printing language is not set to a PCL or a PJL one.
There are many pcl language variations so a PRN file ideally should have some compatibility declaration such as #PJL ENTER LANGUAGE = PCLXL ) HP-PCL XL;2; Note this is NOT an HP printer just one that declares the following code will be HP style.
The printer can accept many formats and the system can produce many different formats for one printer. Thus you need to check all the system settings to understand what language is actually in run time use. Are you sure that .PRN is a full load of the conversation between the system & printer as print to file is not always the expected 2 way code.
The best way to ensure you capture true printer driver output is to change the drivers output PORT to a fixed filename and ideally use the correct format extension NOT unspecified .prn or .pcl if it is not such.

Ghostscript: How to auto-crop STDIN to "bounding box" and write to PDF?

Here have been already quite a few questions and answers about cropping documents with Ghostscript.
However, the answers are not matching my exact needs and are still confusing to me.
I expected that there would be a single option e.g. "-AutoCropToBBox" or something like this.
For clarification, as a bounding box, I understand the smallest rectangular box which contains all (non-white(?)) printed objects completely.
Furthermore, I want/have to use a printer port redirection (RedMon) to generate a cropped PDF via printing to a Postscript-printer from basically any application.
So, under Win7/64bit, I set the redirected port properties:
Redirected port properties Win7/64bit
The output is redirected to
C:\Windows\system32\cmd.exe
The arguments for the program are:
/c gswin64c.exe -sDEVICE=pdfwrite -o -sOutputFile="%1".pdf -
"%1" contains the user input for filename. With this, I get a full-page PDF. Fine!
But how to add the cropping options?
Additional question:
If I have a multipage document will such an (auto-)cropping be individual for each page? Or would there be an option to keep it all the same e.g. like the first page or like the largest bounding box of all pages?
Another related issue:
the window for prompting for the filename is always popping up behind the application I am printing from. Any ideas to always bring it to the front?
Another question:
There is the Perl-script "ps2eps" and program bbox.exe (see http://ctan.org/pkg/ps2eps). It's said there that Ghostscript (or ps2epsi) is occationally(?) calculating wrong bounding boxes. Is this (still) true?
Thanks for your help.

Well your first problem is that PostScript programs are normally written to expect to be rendered to a specific media size, and are usually not tightly bounded to it. White space is important for readability.
So ordinarily the PostScript program you generate will request a specific media size, and the interpreter will do its best to match that. If it can't match it then it will use a strategy to try and get as close as possible, and scale the entire content to fit that media.
You can't expect the printer to perform any of those things if it doesn't know the required size until its finished, and you can't be certain of the bounding box until you have rendered all the marking content. It is true that some files generally EPS files have a %%BoundingBox comment but.. that's a comment, it has no effect in PostScript, its there for the benefit of applications which don't want to interpret the PostScript.
So that's why the simple switch you want isn't there, it would break the interpreter's normal functioning, for rendering.
So, the first thing you need to do is determine the bounding box of the content. You can do that, as Stefan says, by using the bbox device. And on that note, as far as I know the bbox device produces accurate output. If it does not then we would appreciate a bug report proving it so we can fix it. If people don't report bugs how are we supposed to know about them ? Its disappointing to see someone spreading FUD instead of helping out with a bug report.......
ps2epsi isn't Ghostscript, its a crappy cheap and cheerful script, I wouldn't use it. However..... If the original PostScript leaves stuff on the stack then it will end up as a corrupted (or invalid) EPS file and the original PostScript should be fixed before trying to use it as it will break any PostScript program that tries to use it (eg if you include the EPS in a docuemnt and then print it).
So if you are using Ghostscript, and you want to take a PostScript program and get an EPS out of it, use the eps2write device. It won't have a preview bu frankly who cares.
Now if I remember correctly the bbox device (and eps2write) record all marking operations, you can't simply record all the non-white marking operations; what if the white overwrites an existing mark on the page ? What if the media is not white ? Note that if you render to a PNG with Ghostscript, the untouched portion of the output is transparent, whereas white marks are not.
So the bbox is the extent of all the marking operations, regardless of the colour. The only other way to proceed would be to render the content and count the non-white pixels. But that only works at a specific resolution, change the resolution and the precise bounding box may change as well.
Once you have the Bounding Box you can tell Ghostscript to use media that size. Note that you will almost certainly also have to translate the origin, as its unlikely that the content will start tightly at the bottom left corner. You will need -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS to set the media size, and you will need to use -c and -f to send PostScript to alter the origin appropriately. In simple cases an '-x -y translate' will suffice but if the program executes initgraphics you will instead have to set a BeginPage procedure to alter the initial CTM.
If you set the media size with -dDEVICEWIDTHPOINTS etc then all pages will be the same size. If you don't want that then you need to write a BeginPage procedure to resize each page individually (you will also need to hook setpagedevice and remove the /PageSize entries from the dictionary.
I've no idea why Windows is putting the dialog box behind the active Window, it seems to have started doing that with Windows 7 (or possibly Vista). I don't see any way to alter that because I'm not sure what is generating the dialog.....
Personally I would suggest that you try the 2-step approach of running the original through Ghostscript's eps2write device and then take the EPS and create a PDF file using the pdfwrite device and the -dEPSCrop switch. Double converting is bad, but other solutions are worse. Note that EPS files cannot be multi-page, so you will have to create 'n' EPS files from an n-page PostScript program, and then supply a command line listing each EPS file as input to the pdfwrite device.
Take an example file and try this out from the command line before you try scripting it.

As I understood from #KenS explanations:
the way eps2write works, it may not or will not or actually cannot result in the minimum possible bounding box
it needs to be a 2-step process via -sDEVICE=bbox
So, I now ended up with the following process to "print" a PDF with a correct minimum possible bounding box:
Redirected Printer Port to cmd.exe
C:\Windows\system32\cmd.exe
Arguments for the program:
/c gswin64c.exe -q -o "%1".ps -sDEVICE=ps2write - && gswin64c.exe -q -dBATCH -dNOPAUSE -sDEVICE=bbox -dLastPage=1 "%1".ps 2>&1 >nul | perl.exe C:\myFiles\CropPS2PDF.pl "%1"
Unfortunately, it requires a little Perl script (let's call it: CropPS2PDF.pl):
#!usr/bin/perl -w
use strict;
my $FileName = $ARGV[0];
$/ = undef;
my $Crop = <STDIN>;
$Crop =~ /%%BoundingBox: (\d+) (\d+) (\d+) (\d+)/s; # get the bbox coordinates
my ($llx, $lly, $urx, $ury) = ($1, $2, $3, $4);
print "\n$FileName: $llx, $lly, $urx, $ury \n"; # print just to check
my $Command = qq{gswin64c.exe -q -o $FileName.pdf -sDEVICE=pdfwrite -c "[/CropBox [$llx $lly $urx $ury]" -c " /PAGE pdfmark" -f $FileName.ps};
print $Command; # print just to check
system($Command); # execute command
It seems to work... :-)
Improvements are welcome.
My questions are still:
Can this be done somehow without Perl? Just Win7, cmd.exe and Ghostscript?
Is there maybe a way without writing the PS-File to disk which I do not need? Of course, I could also delete it afterwards with the Perl-script.

Why do I obtain countless 'programming' pages of characters/numbers when printing pdf/png files using lpr?

I've got a silly problem which is literally driving me mad:
When I try to print a file using lpr file.pdf depending on the file I obtain one of the following issues:
the printer does not recognise the A4 format
the file is printed but together with a countless number of pages of programming code ( the 'real' face of a PDF file I guess), characters and numbers.
The same happens also for PNG files.
I'm using MAC OS X El capitan and a Xerox colorQube printer.
Clearly if I open the file with Acrobat or Preview and just make the printing manually I have no problem at all.
I hope you can give me some clues because I couldn't find anything useful on the web.
PS: If I use the option -l the printer prints a sheet saying that the printer is not configured to print pdf files directly.

lpr sends file directly to printer, it may not understand pdf as-is, but since pdf is a successor to postscript - it can contain familiar commands so something gets printed, but the rest, probably the embedded preview and so on - gets printed as raw text
Try using ghostscript to convert to postscript before sending to printer:
gs -dSAFER -dNOPAUSE -sDEVICE=(your printer name) -sOutputFile=\|lpr file.pdf

PDF to PostScript Using Ghostscript: large files having issues printing

I'm currently using Ghostscript to convert 500 page PDF files into PostScript.
I'm using Windows 7, Ghostscript x64 v 9.16, and a Kodak Digimaster Commercial Printer.
I use the following arguments for GhostScript to convert a PDF into PS:
C:\Program Files\gs\gs9.16\bin\gswin64c.exe"
-dCompressFonts=true
-dSubsetFonts=true
-dEmbedAllFonts=true
-sFONTPATH=C:\Windows\Fonts\
-dNOPAUSE
-dBATCH
-sDEVICE=ps2write
-sOutputFile="PostScript.ps"
"MyPdf.pdf"
I then add %KDK (proprietary) commands to dictate which pages need to print on which paper by using the %KDKSlip command based on the Printer documentation.
The example below would print all pages on Letter duplex except for pages 1/2 and 5/6. Pages 1/2 would print on a paper defined under the name of "YellowPerf", while 5/6 would print on "TriPerf":
%!PS-Adobe-3.0
%%BoundingBox: 0 0 612 792
%%HiResBoundingBox: 0 0 612.00 792.00
%%Creator: GPL Ghostscript 916 (ps2write)
%%LanguageLevel: 2
%%CreationDate: D:20150506143059-05'00'
%%Pages: 8
%%DocumentMedia: Letter 612 792 0 white ()
%%+ YellowPerf 612 792 0 yellow ()
%%+ TriPerf 612 792 0 white ()
%KDKRequirements: duplex
%KDKSlip: YellowPerf duplex 1
%%+ TriPerf duplex 5
%%EndComments
%%BeginProlog
This is then sent to a Kodak Digimaster printer using a Windows command:
> COPY PostScript.ps PrinterName
This has worked fine with smaller documents, but I'm having issues with larger page sets.
When I attempted to print to the Digimaster using a 500 page PDF to Postscript file, it had errors occur: "Busy, do not reset the RIP".
File size of those that didn't work:
PostScript File Size: 52 MB
PDF File Size: 41 MB
File sizes of those that did work:
PostScript File Size 1MB
PDF File Size: .8 MB
Why does this work fine with smaller files but get hosed on larger files?
Would anyone have any advice?

It is not necessarily the filesize of the PostScript that causes your problem:
It could be the PostScript itself, or
it could be that you made a mistake with your editing of the PS files when you inserted the (proprietary) %KDK-comments.
Are you sure your text editor doesn't silently change your linefeed characters?! This could also change the binary parts of the PostScript!
Also, I'm not sure if the copy command does handle print jobs like it should. I would prefer the lpr command (ah... is that even still available on your version of Windows?!)
To debug this and to explore a few different roads to successful printing, I would try a few different steps:
To debug
Send the original PostScript, without the added %%KDK DSC header comments, to the printer.
That printer model has a nice feature you can utilize: you can check if its RIP processes the input file completely and successfully without needing to output your 500 pages on (wrong) paper and waste it therefore (you'd also need to discard it afterwards -- too much work too). Just click the red "Stop" button on its user interface monitor.
Does that one complete the RIP process successfully?
Yes? Now you can now even print it. Before you do so you can even modify the job settings to select a particular paper tray, by clicking on some button on the interface (can't recall the exact button label though). Then "release" the job and it will print.
If it worked, you can again turn your attention to get your %%KDK lines right.
If it didn't you have to try another route.
Check if a different PDF-to-PS converter is working
Create a PostScript file with the help of pdftops (see here for the pdftops.exe version -- read the README to see which options are available).
Proceed analog to above: first see if it completes the RIP process. Then continue with your %KDK manipulations....
Check if the direct PDF printing is working
The Digimaster model can consume PDF directly. (Well, internally it uses its own PDF-to-PS converter, but that isn't visible to the outside -- so it doesn't really count as a PDF RIP...)
If that works, you can even prepend your appropriate %KDK comments to the PDF file, similar to the lines below (don't rely on me getting the details right, it's from the top of my head, and memory is decades old!):
%!PS-Adobe-3.0
%%.........................
%%DocumentMedia: ..........
%KDKRequirements: .........
%KDKInserts: ..............
%KDKSlip: .................
%KDKBody: .................
%KDKCovers: ...............
%KDKPDFPrintAnnotations: on
%KDKPDFFitToPage: on
%KDKBinaryOK: on
<esc>%-12345X
%%Emulation: pdf
%PDF-1.5
%...here follow the lines of the original PDF file...
...
Send jobs via "Kodak Printfile Downloader" (KPD)
For Windows there used to be the so-called 'Kodak Print File Downloader' (KPD). The KPD is an application, not a printer driver. Not sure if it is still available.
You could open its GUI, then load a PS, PDF, PCL or TIFF file into its to-be-printed-list of jobs. Then select a few job options (like trays, stapling, sorting etc.). Lastly, send the job off to the Digimaster...
The KPD essentially does the same thing, as you want to achieve: insert %KDK commands into the file header. But you want to do it with a script or an editor (and possibly automatically via a batch process, once it works).
The KPD requires interactive user activity and cannot be scripted.
But you can (ab-)use it to intercept the files it creates from the Windows spooling system, study them and then adapt your scripted efforts so that they also work....
Update
(I had wanted to add this already in my initial answer. But time ran out, so I skipped it for the time being..)
Observe the RIP processing directly at the printer UI
Digimaster printers have their own built-in touchscreen or flatscreen or tube monitor (depending on the age of the model). They also typically have a full-time operator who knows the machine and its tweaks and peculiarities quite will. The machine may be quite a distance from the user sending a job.
So the following should be done when debugging a print problem:
Ask the operator to set the printer to "stop printing", but still "receiving new jobs".
Submit any job(s) you want.
Walk up to the printer and its operator.
Release the job for RIP-ping and observe what happens:
You may see everything going alright and completing until the last page (you know how many pages you submitted, right?)
Or you may see the job aborting at a certain page number.
Or you may see the printer RIP chewing extremely long on a certain page (or several pages), but finally completing the job.
Or you may see the printer RIP hanging with a certain page forever.
Or...
In any case, the details which are observable here may give important clues about where to look next...

Difference when printing PDF file to different printers

I'm generating PDF files and they look OK in Acrobat Reader. However, when I try to print documents to different printers, selecting the same "Fit To Page" scaling option, the results are a bit different.
One printer leaves a bit different page margins and space between characters.
Is it generally possible to achieve exactly the same looking printouts on different printers with the same scaling options? If yes, is there some sort of check-list for that?

You can never guarantee the output from device to device. This is because the printer interpret what you send it, and often uses built-in fonts and other resources that might be slightly different from manufacturer to manufacturer. You could do the following to help:
Embed all fonts into the PDF (including system fonts)
Force the printer driver to download all fonts and not use the ones internal on the device
Whenever possible, don't scale, send it down in the native size. This might mean that you have to make the output small enough to fit in the print window of a wide range of printers
The other thing to keep in mind is that when printing PDF files, if you use Postscript, there is a high likelihood that Acrobat Reader will do the PDF to PS conversion and pass it through the driver. You probably see an option in the print driver relating to this. Leaving it on pass-through should give you more consistent results.
There is also an option in Acrobat Reader to "print as image". This will slow printing dramatically but could assist.
If you have an application that is sending this file to print keep in mind that many newer printers support native PDF printing. This means you can simply fire the PDF at the device itself and not have to view or print it from Acrobat. In this case you are doing one less conversion / interpretation, so it could help.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas