Asymmetrical Groff output - groff

For years, I've been generating tables created with raw Groff commands. All I did was
groff -t file >file.ps, and I got what I wanted. The administrator upgraded the GNU utilities, and now the same scripts produce output on a page that seems longer and narrower than before. The output is now asymmetrical. What do I do?
Thank you,
Robert Katz

Ghostscript on Unix generating huge files

I use Ghostscript 9.14, the last one compiled for HP-Unix.
I need to create PDF/A-1b files from existing PDF files from different sources.
Ideally this should happen on an HP-Unix server, because that is the server that puts them in a DMS.
The command:
gs -q -dPDFA -dBATCH -dNOPAUSE -dNOOUTERSAVE \
-dCCFONTDEBUG -dCFFDEBUG -dCMAPDEBUG -dDOCIEDEBUG -dEPSDEBUG \
-dFAPIDEBUG -dINITDEBUG -dPDFDEBUG -dPDFOPTDEBUG -dPDFWRDEBUG \
-dSETPDDEBUG -dSTRESDEBUG -dTTFDEBUG -dVGIFDEBUG -dVJPGDEBUG \
-dColorConversionStrategy=/sRGB -dProcessColorModel=/DeviceRGB \
-sDEVICE=pdfwrite -sPDFACompatibilityPolicy=2 \
-sOutputFile=debug_0901ece380001a00.pdf /usr/../PDFA_def.ps \
/0901ece380001a00.pdf
The source pdf is filled with just non-OCRed images.
I have this working on a newer version on a Windows server (Ghostscript 9.19) without problems and with the same command but can't seem to get it working on HP-Unix.
MS Office is installed on the Windows server.
The HP-Unix command generates a 9 MB file for a 300 KB source file, and it takes ages to generate.
Ghostscript seems to be single-threaded, but 9 minutes for 35 pages is a bit much.
When I check with Preflight in Acrobat Pro 9 Extended, the 9 MB file really is PDF/A-1b.
Do I need to install a kind of Office software on Unix to get this working?
Or an image editing tool?
Also, how do I check the debug lines? They aren't in a readable format and I can't find any info on that.
Maybe it is something that only can be checked by the Ghostscript developers?
Almost certainly the input file contains transparency. PDF/A-1 does not support transparency, and so when creating PDF/A-1 files any page which does contain transparency is rendered to an image, and then that image is embedded in the output.
Clearly this will take time (rendering a page at 720 dpi, in full colour, with transparency processing is slow) and will result in a large file. However, it's the only way to preserve the appearance of the input file and still create a PDF/A-1 file.
Of course, in the absence of an example input file to examine, it's not possible to be certain of this.
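To get a feel for why rendering at that resolution is slow and produces big files, here is a rough back-of-the-envelope sketch. It assumes an A4 page (595 x 842 PostScript points) and 3 bytes per pixel for RGB; neither figure comes from the question, and the embedded image is compressed, so this is only the size of the raw raster Ghostscript has to work on.

```python
def raster_bytes(width_pt, height_pt, dpi, bytes_per_pixel=3):
    """Uncompressed size in bytes of one page rendered at the given resolution."""
    # There are 72 PostScript points to the inch.
    width_px = round(width_pt * dpi / 72)
    height_px = round(height_pt * dpi / 72)
    return width_px * height_px * bytes_per_pixel


# Assumed page size: A4 in points. Assumed colour depth: 8-bit RGB.
size = raster_bytes(595, 842, 720)
print(f"{size / 1_000_000:.0f} million bytes of raw pixels per page")
```

That is on the order of 150 million bytes per page before compression, so 35 pages handled this way add up quickly.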
The DEBUG switches are useless except to Ghostscript developers; don't bother to set them. You would never set so many anyway, because you'd be swamped with extraneous detail. I'm doubtful all the ones you have listed are even valid.
You say you have this 'working' with Ghostscript 9.19 on Windows. What do you mean by 'working'? It seems to me that the 9.14 output 'works' as well.
As far as I know we have never compiled a Ghostscript release for HP/UX, but the current version (9.22) is known to compile and run on HP/UX.
Finally Ghostscript does not rely on (and indeed cannot make use of) Microsoft Office. Nor does it rely on the operating system for anything except memory and file access.
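For what it's worth, here is a minimal sketch of the same invocation with the DEBUG switches dropped, wrapped in Python so the paths are easy to swap. Every remaining switch is copied unchanged from the question; the function name and the example file names are my own placeholders, and nothing here has been checked against 9.14 on HP-UX.

```python
import shlex


def pdfa_command(source_pdf, output_pdf, pdfa_def_ps, gs="gs"):
    """Build the gs argument list from the question, minus the -d...DEBUG switches."""
    return [
        gs, "-q", "-dPDFA", "-dBATCH", "-dNOPAUSE", "-dNOOUTERSAVE",
        "-dColorConversionStrategy=/sRGB",
        "-dProcessColorModel=/DeviceRGB",
        "-sDEVICE=pdfwrite",
        "-sPDFACompatibilityPolicy=2",
        f"-sOutputFile={output_pdf}",
        pdfa_def_ps,   # the PDFA_def.ps prologue must come before the input file
        source_pdf,
    ]


# Placeholder file names, only to show the resulting command line.
print(shlex.join(pdfa_command("input.pdf", "output.pdf", "PDFA_def.ps")))
```

To actually run it, pass the list to subprocess.run(..., check=True).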

What's a better way to open man pages?

I'm always forgetting to read man pages for Unix commands during my work as a developer. I don't like the clunky man page interface, and it's hard to get to the parts I need, like a useful example of how to use the command.
Is there a better way? My chosen IDE is PHPStorm, in case that is relevant to your answer. My daily OS is OS X 10.8.5.
You can pipe | the output of a command (like a man page) to your IDE. I use Sublime, so for me, to read the man page for ls, I would do...
man ls | sublime
If for some reason your IDE can't do that, you can just write the output to a .txt file and read it there, using any of the shell's "write" (redirection) operators, for example man ls > ls.txt.
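One wrinkle with saving man output to a file: depending on your man/groff setup, bold and underlined text may be encoded as backspace overstrikes, which look like garbage in an editor. Here is a small sketch that captures a page and strips those sequences. The function names are mine, and it assumes man is on your PATH and that your man emits overstrikes rather than colour escape codes.

```python
import re
import subprocess


def strip_overstrike(text):
    """Drop 'char + backspace' pairs, used for bold (N, BS, N) and underline (_, BS, N)."""
    return re.sub(r".\x08", "", text)


def man_page_text(topic):
    """Return the man page for `topic` as plain text (needs `man` on PATH)."""
    result = subprocess.run(
        ["man", topic], capture_output=True, text=True, check=True
    )
    return strip_overstrike(result.stdout)
```

Something like pathlib.Path("ls.txt").write_text(man_page_text("ls")) then gives you a clean file to open in any editor. In the shell, man ls | col -b > ls.txt does much the same job.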

Converting correctly pdf to ps and vice-versa

I'm using "pdftops" to convert .pdf files to .ps files and then "ps2pdf" for the reverse process (poppler-utils). The problem is that when creating the .pdf files from the .ps files, the text looks OK, but when I try to copy it, the characters are very strange (as if they were corrupted). I have used these tools on other files for a long time and they worked fine.
I also tried "pdftohtml -xml" to create an .xml file, and the text is ok (the characters are extracted correctly).
What could be going wrong in the conversion? Are there options for "pdftops" or "ps2pdf" that I need to change?
If I create the .xml output, is there a way to create a .pdf file from the .xml file?
EDIT:
Output for "pdffonts original.pdf"
Output for "roundtripped.pdf"
I'm just covering the PS->PDF conversion... (I'm assuming your phrase of vice-versa isn't meant to point to a 'round-trip' conversion of the very same file [PDF->PS->PDF], but the general direction of conversion for any PS file. Is that correct?)
First of all, most likely your ps2pdf is only a shell script, which internally uses a Ghostscript command with some default parameters to do the real work. ps2pdf is much easier to use. Ghostscript has many more options, but it is more difficult to learn. ps2pdf takes away a lot of the potential control you could have if you used Ghostscript directly. (You can tweak a few parameters with ps2pdf -- but then you are already so much closer to running the real Ghostscript command...)
Second, without knowing exactly how your PS input file is set up, it is difficult to give you good advice: Does your PS file embed the fonts it uses? Which type of fonts are they? etc.
Third, Ghostscript has gained a lot of additional power and control, and had a few bugs or weak spots removed, over the last few years when it comes to outputting PDF. So, which version of Ghostscript is installed on your system? (Remember, ps2pdf calls Ghostscript; it will not work without a locally installed gs executable.)
One likely cause for your inability to copy text from the PDF could be the font type (and encoding) that ended up being used and embedded in your PDF file. Which font details can you tell us about your resulting PDFs? (Try pdffonts your.pdf to find out -- pdffonts is also part of the Poppler utils you mentioned.)
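If you want to check this across many files, here is a rough sketch that parses pdffonts output and lists the fonts with no ToUnicode map (the "uni" column), which are the usual suspects when copied text comes out garbled. It assumes the column layout printed by recent Poppler versions (name, type, encoding, emb, sub, uni, object ID) and font names without spaces. SAMPLE is made-up output for illustration, not taken from the files in the question.

```python
# Illustrative pdffonts output; the font names and object IDs are invented.
SAMPLE = """\
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ABCDEF+Helvetica                     Type 1C           WinAnsi          yes yes no      12  0
GHIJKL+TimesNewRoman                 CID TrueType      Identity-H       yes yes yes     15  0
"""


def parse_pdffonts(output):
    """Parse pdffonts output into a list of dicts, one per font."""
    fonts = []
    for line in output.splitlines()[2:]:  # skip the header and the dashed ruler
        fields = line.split()
        if len(fields) < 8:
            continue  # blank or unexpected line
        fonts.append({
            "name": fields[0],
            # The type can contain spaces ("Type 1C", "CID TrueType"),
            # so count the fixed columns from the right-hand side.
            "type": " ".join(fields[1:-6]),
            "encoding": fields[-6],
            "embedded": fields[-5] == "yes",
            "subset": fields[-4] == "yes",
            "to_unicode": fields[-3] == "yes",
        })
    return fonts


def fonts_without_unicode_map(output):
    """Names of fonts whose 'uni' column is not 'yes'."""
    return [f["name"] for f in parse_pdffonts(output) if not f["to_unicode"]]
```

Feed it the stdout of pdffonts for the original and for the converted file and compare the two lists.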
You may try this (full) Ghostscript command for PS->PDF conversion and check where it takes you:
gs \
-o output.pdf \
-sDEVICE=pdfwrite \
-dPDFSETTINGS=/prepress \
-dHaveTrueTypes=true \
-dEmbedAllFonts=true \
-dSubsetFonts=false \
-c ".setpdfwrite <</NeverEmbed [ ]>> setdistillerparams" \
-f input.ps

Gluing (Imposition) PDF documents

I have several A4 PDF documents which I would like to "glue" together, two into one, into an A3 format PDF document. So from two A4 PDFs I would get a single one-sided A3 PDF.
I have found the excellent utility PDFToolkit and some others but none of them can be used to "glue" side by side two documents.
I just came across a nice tool on superuser.com called PDFjam that can do all of the above in a single command:
pdfjam --nup 2x1 file1.pdf file2.pdf --outfile DONESKI.pdf
It has other standard features like page size plus a nice syntax for more sophisticated collations of pages (the tricky page re-ordering necessary for true booklet-style page imposition).
It's built on top of TeX which is, whatever it is. Installing is a breeze on Ubuntu: you can just apt-get install pdfjam. On Mac OS, I recommend getting BasicTeX (google "mactex basictex"; SO thinks I'm a spammer and won't let me post the link).
This is a lot easier and more maintainable than installing both pdftk and Multivalent (on both Mac OS for dev and Ubuntu for deploy), which wasn't going so well for me anyway...!
Found the following (free and open-source) tool for doing Imposition called Impose (thanks danio for the tip). This solved my problem perfectly.
EDIT:
Here is how it's done:
Use PDF Toolkit to join two PDF files into one (two A4 pages):
pdftk File1.pdf File2.pdf cat output OutputFile.pdf
Create from this a single page (one A3):
java -cp Multivalent.jar tool.pdf.Impose -dim 2x1 -verbose -paper-size "42.2x29.9cm" -layout "1,2" OutputFile.pdf
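In case the numbers in that command look arbitrary: -dim 2x1 is a grid of two columns and one row, and the paper size is roughly that grid of A4 pages. A tiny sketch of the arithmetic, assuming ISO A4 at 21.0 x 29.7 cm and no margins (the function name is mine):

```python
A4_CM = (21.0, 29.7)  # ISO A4 portrait: width, height in centimetres


def sheet_size(page_cm, columns, rows):
    """Size of a sheet holding a columns x rows grid of identical pages, no margins."""
    width, height = page_cm
    return (round(width * columns, 1), round(height * rows, 1))


print(sheet_size(A4_CM, 2, 1))
```

Two A4 pages side by side come to 42.0 x 29.7 cm, which is A3 landscape. The 42.2x29.9cm in the command is 2 mm larger in each direction; my guess is that this is just a little slack, but I haven't verified that.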
I would like to advertise my pdftools.
It's written in Python, so it should run on any platform. It's a wrapper around LaTeX (the pdfpages package) but can do a lot of things with a single command line: merge PDF files, n-up them (multiple input pages per output page) and number the pages of the output file (you specify the location and the format of the number).
It still needs some work, but I think it's stable enough to be usable right now :)
This puts two landscape letter pages onto a single portrait letter sheet, to be "bound" (i.e., folded) along the top.
pdftops "$1" - |
psbook |
pstops -w11in -h8.5in '4:1#.65(.5in,0in)+0#.65(.5in,5.5in),2U#.65(8in,5.5in)+3#.65U(8in,11in)' |
ps2pdf - "$(basename "$1" .pdf).psbook.pdf"
By the way, I do this often, so I'll probably submit more "answers" to this question just to keep track of successful pstops pagespecs. Let me know if this is an inappropriate use of SO.
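For anyone puzzled by the psbook step in that pipeline: before pstops lays pages out on the sheet, they have to be put into signature order so that the folded booklet reads correctly. Here is a sketch of the usual single-signature saddle-stitch ordering. As far as I know this matches what psbook emits for one signature, but treat that as an assumption and check it against your version. None marks a blank page added as padding.

```python
def booklet_order(page_count):
    """Page order for a single folded signature; None stands for a blank filler page."""
    padded = -(-page_count // 4) * 4  # round up to a multiple of 4
    order = []
    low, high = 1, padded
    while low < high:
        # One sheet: outer pair on the front side, inner pair on the back.
        order += [high, low, low + 1, high - 1]
        low += 2
        high -= 2
    return [page if page <= page_count else None for page in order]


print(booklet_order(8))
```

For an 8-page document this gives 8, 1, 2, 7, 6, 3, 4, 5: pages 8 and 1 share the outside of the outer sheet, 2 and 7 its inside, and so on.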
A nice, powerful, open-source imposition tool is included
in the PoDoFo package:
http://podofo.sourceforge.net/
It works for me. Some imposition plans can be found at:
http://www.av8n.com/computer/prepress/
PoDoFo can do lots of other stuff, not just imposition.
Another useful imposition tool is Bookbinder (on the
quantumelephant site). It has a GUI that appeals to non-experts.
It is not as flexible or powerful as PoDoFo, but it can do
imposition.
pdftk is more-or-less essential to have, but it will not
do imposition.
pdfjam is useless to me, because there is a wide range of
valid PDF files that it cannot handle.
I've never been able to get Multivalent to work, either.
What you want to do is imposition. There are commercial tools to impose PDFs, such as ARTS Crackerjack and Quite Imposing, but they are pretty expensive (US$500), require a copy of Acrobat Professional and are overkill for imposing 2 A4 pages onto an A3 sheet.
On the Postscript side, a tool named pstops is able to rearrange pages of a Postscript file in any way you could imagine. I've not heard of such a tool for PDF. But pdf2ps and ps2pdf exist. So a not-so-ideal solution may be a combination of pdf2ps, pstops and ps2pdf.
I would combine the two A4 pages into one 2-page PDF using pdftk. Then Print to PDF using something like PrimoPDF, and tell it to print to A3 format, two pages per side.
I just tested this printing some slides from PowerPoint. It worked great. I selected A3 as my paper size in PowerPoint, and then chose to print 2 pages per side. Printed to Primo and voila, I have two A4 slides per A3.
You can put multiple input pages on one output page using BookletImposer.
And you can change page orders and combine multiple pdf files using PDF Mod.
With these two tools, you can do almost everything you want with pdf files (except editing their content).
I had a similar problem. I tried Impose but it was giving me an
Exception in thread "main" java.lang.NoClassDefFoundError: tool/pdf/Impose
Caused by: java.lang.ClassNotFoundException: tool.pdf.Impose
(...)
Could not find the main class: tool.pdf.Impose. Program will exit.
I then tried PDF Snake which isn't free or open source, but has a completely unrestricted 30-day trial version. It worked perfectly, after tweaking the parameters to achieve what I wanted. It's a great tool. I would definitely buy it if it wasn't so expensive! Anyway, I thought I'd leave my 2 cents in case anyone had the same problem I had with Impose.
Look at this:
http://sourceforge.net/projects/proposition/
It needs LaTeX to run,
but when it does, it works really well.
Regards

Converting PCL to PDF

I am looking to create (as a proof-of-concept) an OCaml (preferably) program that converts PCL code to PDF format. I am not sure where to start. Is there a standardized algorithm for doing so? Is there any other advice available for accomplishing this task?
Thanks!
Conversion of PCL to PDF can be incredibly complex (assuming you need it to be generic and not just for simple PCL). We've investigated this many times and in the end always revert to using other tools. We keep investigating because we are a development shop that uses and understands all elements of PCL in great detail. If you are not really familiar with PCL, it will be a daunting task. One of the major issues is that, over time, printers have become, for the most part, tolerant of malformed PCL, and as such, creating something that follows the rules to the letter is not always sufficient. If, however, you have control over the PCL, you may be able to work it out with some amount of success.
I don't mean to turn you off of this, and I realize that you've come here looking for a programming answer, but I have to say, this is far from a simple task and there are no 'standardized algorithms' for this (that I'm aware of).
If this is designed to be a tool to work alongside something else you are building, I'd highly recommend looking at these guys:
PageTech
This is by far the most complete set of tools (Windows) for handling this. There are a few others but, based on our extensive use of PCL and conversion tools over the years, this is the only one that works all the time.
EDIT: Most recently we've been working with LincPDF (http://www.lincolnco.com/). This is also an excellent product that has one big benefit: deployment is simple. Some of the other tools have complex software installations. This solution is very easy for us to deploy as a feature in an application. It's also faster than any tools we've tested to date (at least with the PCL that we generate from our apps, which is quite complex as it includes specialized fonts and macros).
Ghostscript developers have recently integrated their sister products GhostXPS, GhostPCL and GhostSVG into their Ghostscript source code tree. (It's now called GhostPDL.) So all of these additional functionalities (load, render and convert XPS, PCL and SVG) are now available from there.
This means you could build their language switching binary from their sources. This, in theory, can consume PCL, PDF and PostScript and convert this to a host of other formats. While it worked for me whenever I needed it, Ghostscript developers recommend to stop using the language switching binary (since it's 'almost non-supported' -- see KenS' comment to this answer) and instead switch to using the explicit binaries pcl6.exe (PCL input), gsvg.exe (SVG input, also 'almost non-supported') and gxps.exe (support status unclear to me).
So to 'convert PCL code to PDF format' as the request reads, you could use the pcl6 command line utility, a sister product to Ghostscript's gs/gswin32c.exe.
Sample commandline:
pcl6.exe \
-o output.pdf \
-sDEVICE=pdfwrite \
[...more parameters as required (optional)...] \
-f input.pcl
Updated as per KenS' hints in the comment....
There is a series of reference books from HP; you could re-implement a PCL parser and output corresponding PDF.
You might start with the "PCL 5 Printer Language Technical Reference Manual" (http://h20000.www2.hp.com/bc/docs/support/SupportManual/bpl13210/bpl13210.pdf). Search HP for more (http://search.hp.com/query.html?qt=PCL+reference).
Or you could steal code or ideas from GhostPCL (http://www.ghostscript.com/GhostPCL.html)
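If you do go the write-your-own-parser route, the first step is splitting the stream into escape sequences and printable text. Here is a very rough sketch of that step (in Python rather than OCaml, purely for illustration). The character ranges are my reading of the PCL 5 reference: a two-character sequence is ESC plus one character in the range 48-126, and a parameterized sequence is ESC, a character in 33-47, an optional group character in 96-126, then value/letter pairs where a lowercase letter continues the sequence and a character in 64-94 ends it. It deliberately ignores commands followed by binary data (raster rows, font downloads) and HP-GL/2 mode, which a real converter must handle. The token layout is my own invention.

```python
import re

_VALUE = r"[+-]?[0-9]*\.?[0-9]*"
# ESC, parameterized char, optional group char, then value/letter pairs.
_PARAMETERIZED = re.compile(
    r"\x1b([!-/])([`-~]?)((?:" + _VALUE + r"[`-~])*" + _VALUE + r"[@-^])"
)
_TWO_CHAR = re.compile(r"\x1b([0-~])")
_PAIR = re.compile(r"(" + _VALUE + r")([`-~@-^])")


def tokenize(data):
    """Split PCL text into ('command', name, value) and ('text', content, '') tuples."""
    tokens = []
    pos = text_start = 0
    while pos < len(data):
        if data[pos] != "\x1b":
            pos += 1
            continue
        if pos > text_start:
            tokens.append(("text", data[text_start:pos], ""))
        match = _PARAMETERIZED.match(data, pos)
        if match:
            prefix = match.group(1) + match.group(2)
            # A combined sequence such as ESC&l1o2E expands to one command per pair.
            for value, letter in _PAIR.findall(match.group(3)):
                tokens.append(("command", prefix + letter.upper(), value))
        else:
            match = _TWO_CHAR.match(data, pos)
            if match:
                tokens.append(("command", match.group(1), ""))
        pos = match.end() if match else pos + 1  # skip a stray ESC
        text_start = pos
    if text_start < len(data):
        tokens.append(("text", data[text_start:], ""))
    return tokens
```

Read the file as bytes and decode it as latin-1 before passing it in, so every byte maps to exactly one character. Turning these tokens into PDF page content (fonts, cursor positioning, rasters) is where the real work is.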