Best way to generate a custom document? - pdf

I am working on generating a document for printing. It should use a specific TTF font and everything must be printed with vector graphics (for quality). Some of the text should be replaced automatically (e.g. current time). Also it should include a custom-generated EPS image with a chart.
Ideally I would like to have some kind of document template where the text could be replaced easily, and it would be nice if it could import the image by path. But I am not sure which format would be good for this. The best I can think of is LaTeX, but I don't like that it's a lot of manual work to use it with TTF... any other ideas?
By the way, I am using OS X...

The memoir package is very flexible for special layouts.
XeTeX can use your system fonts (it is installed together with TeX Live).
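
For the "template with replaceable text" part, here is a rough sketch in Python that fills a XeLaTeX template and leaves the typesetting to xelatex. The placeholder names (font_name, timestamp, chart_path) and the helper render_tex are made up for illustration; memoir, fontspec and graphicx are the real packages. I'm assuming the font is installed on the system so fontspec can find it by name.

```python
from datetime import datetime
from string import Template

# Raw string so LaTeX backslashes survive; ${...} marks the replaceable parts.
# Placeholder names are hypothetical, chosen for this sketch only.
TEX_TEMPLATE = Template(r"""\documentclass{memoir}
\usepackage{fontspec}
\usepackage{graphicx}
\setmainfont{${font_name}}
\begin{document}
Generated on ${timestamp}.

\includegraphics[width=\linewidth]{${chart_path}}
\end{document}
""")


def render_tex(font_name, chart_path, now=None):
    """Return XeLaTeX source with the font, time stamp and image path filled in."""
    now = now or datetime.now()
    return TEX_TEMPLATE.substitute(
        font_name=font_name,
        chart_path=chart_path,
        timestamp=now.strftime("%Y-%m-%d %H:%M"),
    )


if __name__ == "__main__":
    # Write the result to a file and run "xelatex report.tex" on it afterwards.
    with open("report.tex", "w", encoding="utf-8") as handle:
        handle.write(render_tex("Gill Sans", "chart.eps"))
```

Depending on the TeX setup, the EPS chart may need to be converted to PDF first. The template must not contain other dollar signs (for example inline math), because string.Template treats them as placeholders.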

You could blend most of those elements into an EPS using ImageMagick or GIMP Script-Fu.

There are several products out there that will build you a PDF programmatically. I've only used the Coldfusion Report Builder myself and that may not be practical/affordable for your application. If your budget allows I'd look into a commercial reporting product. I know Adobe have several that will generate Flash, FlashPaper or PDF output.

Related

PDF - Auto adjust for mobile?

Does anyone know if there is a function in PDFs to allow them to auto-adjust the view depending on whether it is on a desktop or mobile? Or even by screen size?
I am looking to prepare PDF material for distribution; however, the user group includes a mix of desktop and mobile, so instead of creating two PDFs I would like to have a single PDF which adapts to the user's screen.
This is not possible with PDF files up to PDF version 1.7, the most commonly used one out there.
PDF 2.0, which was released three years ago, has such a feature, but it depends on the viewer implementing it and on the PDF writer correctly annotating the PDF. I guess there are PDF viewers out there that can already do this, but I'm not specifically aware of any.
If I were you, I would write the document in a format like LaTeX that can easily be converted to both kinds of PDFs, one for desktop and one for mobile.

A better file format than PDF or EPUB?

My client wants us to build a custom document viewer for their app. (It really, truly needs to be custom, because there are a ton of application-specific features they need.)
We built one for them last year that took PDFs, generated page images, and backed the images using a hidden layer of text that could be selected and copied. We did it in Flex. It was a nightmare. PDF is horrid.
This year, we need to build one in HTML 5 with similar requirements, except that most of the documents now are in Word or HTML, that is, they have reflowable text, instead of the fixed layout and glyphs of PDF. But they still want to do PDF in the same viewer.
I'm thinking that we need to convert all documents to some common file format that can handle both reflowable text and also the fixed-position glyphs of PDF. (Each document would probably support one or the other, but not both). It would be nice if it were an XML-like markup language that would say:
<text>here's some text</text>
-- or --
<glyph letter="a" name="my_a_glyph" position="10,10"/>
<image src="my_image" position="20,20"/>
or something like that.
Is there any existing file format out there that can handle it? EPUB won't do the fixed-position text, and PDF sucks in too many ways to describe.
I think you can look at the FB2 (FictionBook 2) format. It is an XML-based format designed for publishing books. It includes images, though I am not sure if they can be positioned absolutely.
Also, you can simply go with HTML and do HTML-to-PDF rendering when needed (there exist various components and libraries for this). I don't see (or you have not listed) any reason why this wouldn't work.
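
To illustrate the HTML route: reflowable text can stay as ordinary paragraphs, while fixed-position glyphs and images become absolutely positioned elements. Below is a small sketch in Python that converts the markup from the question into such HTML. The function name to_html, the wrapping doc element, and reading the positions as CSS pixels are all my own assumptions, not part of any existing format.

```python
import html
import xml.etree.ElementTree as ET


def to_html(markup: str) -> str:
    """Convert the question's hypothetical <text>/<glyph>/<image> markup to HTML."""
    # Wrap the fragment in a root element so it parses as one XML document.
    root = ET.fromstring(f"<doc>{markup}</doc>")
    parts = []
    for element in root:
        if element.tag == "text":
            # Reflowable text: a normal paragraph the browser lays out.
            parts.append(f"<p>{html.escape(element.text or '')}</p>")
        elif element.tag in ("glyph", "image"):
            # Fixed layout: "x,y" is interpreted as CSS pixels (an assumption).
            x, y = element.get("position", "0,0").split(",")
            style = f"position:absolute;left:{x.strip()}px;top:{y.strip()}px"
            if element.tag == "glyph":
                letter = html.escape(element.get("letter", ""))
                parts.append(f'<span style="{style}">{letter}</span>')
            else:
                src = html.escape(element.get("src", ""))
                parts.append(f'<img src="{src}" style="{style}">')
    return "\n".join(parts)


if __name__ == "__main__":
    sample = (
        "<text>some reflowable text</text>"
        '<glyph letter="a" name="my_a_glyph" position="10,10"/>'
        '<image src="my_image" position="20,20"/>'
    )
    print(to_html(sample))
```

The absolutely positioned elements would need a positioned container (one per page, say) to be useful in a real viewer.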
GROFF? Maybe build a macro library to customize it, as needed.
Groff/troff/nroff, the "run off" programs of Unix, can output to PostScript or HTML. The jump from PostScript to PDF is built in to some PDF viewers; there are also several existing programs for it, pstopdf, for example.
GROFF has some fixed layout options and some flow-like options. With GROFF, it's almost easier to base most of the printout on flowing text, within prescribed bounds.

PDF Outline Text - Automation of Acrobat Sequences

I have built an application that automates the filling out of form fields inside a pdf. It then takes various assets and combines them together to generate a "print ready" product. All of this is accomplished using the magic of iTextSharp. When form fields are populated, they are then flattened to text. The problem is that even with the fonts embedded they aren't really attached to the form fields in a meaningful way (like straight text elements are) and the printers are complaining that the pdf is generating licensing errors due to this. I researched this a bit and it just seems to be the nature of how form fields are.
The artists we are working with requested that we research a way to "outline" the text that is created from flattening the form fields. I found that running the PDF Optimizer with a custom preset allows for Text Outlining in Acrobat, and even better I can generate an Acrobat Sequence that runs this command on the pdf. The problem is that Sequences can not be automated, at all.
I found a plug-in called AutoBatch that allows for the execution of Sequences on the command line through a batch file. The downside is that this would require installing Acrobat Pro and the plug-in on the server this application will be running on. Further, it seems like an overkill solution just to outline the text in the pdf. For all I know at this point iTextSharp may allow me to do this programmatically, but searching for such a thing on Google returns few results and nothing relevant.
So the question: Is there a better way to outline text in a pdf than the current solution I have implemented or am I kind of stuck?
TLDR; PDF is generated w/ non-standard fonts. I need to "outline" this text to send it to the printer. Currently using AutoBatch Acrobat Plug-In to execute Acrobat Sequence from the Command Line. Seems excessive, wondering if anyone knows a better way to automate font outlining.
I am also in a printing environment and have used forms for "Box Covers" plenty of times to shorten the code used to produce box covers.
I simply use "pdfStamper.FormFlattening = true;" and the printers (Xerox DP180 and DC5000) have no problems using the PDF.
The moment I leave out FormFlattening the printer gives a lot of errors regarding the PDF.
If you are using FormFlattening then check if the printer has the font locally installed in order for it to reference the font from the print engine instead of the PDF resources.

PDF Colo(u)r Analysis (without Acrobat itself ?)

Is there a library/tool which would list all colours used in a PDF document ?
I'm sure Acrobat itself would do this but I would like an alternative (ideally something that could be scripted).
So the idea is if you have a very simple PDF document with four colours in it the output might say :
RGB(100,0,0)
RGB(105,0,0)
CMYK(0,0,0,1)
CMYK(1,1,1,1)
You could explore the insides with pdfbox, but you would have to write some code to find and catalog all those colors.
Most PDF tools have access to this information but no API to expose it. You could take any tool and add it in.
Apago PDFspy generates an XML file containing all kinds of metadata extracted from PDF files. It reports color usage including spot colors.
We recently added a function called GetPageColorSpaces(0) to the Quick PDF Library - www.quickpdflibrary.com to retrieve much of the ColorSpace info used in the document.
Here is some sample output.
Resource,\"QuickPDFCS2eb0f578\",Separation,\"HKS 52 E\",DeviceCMYK,0.95,0,0.55,0
Resource,\"QuickPDFCSb7b05308\",Separation,\"Black\",DeviceCMYK,0,0,0,1
Resource,\"QuickPDFCSd9f10810\",Separation,\"Pantone 117 C\",DeviceCMYK,0,0.18,1,0.15
Resource,\"QuickPDFCS9314518c\",Separation,\"All\",DeviceCMYK,0,1,0,0.5
Resource,\"QuickPDFCS333d463d\",Separation,\"noplate\",DeviceCMYK,1,0,0,0
Resource,\"QuickPDFCSb41cafc4\",Separation,\"noprint\",DeviceCMYK,0,1,0,0
Resource,\"Cs10\",DeviceN,Black,Colorant,-1,-1,-1,-1
Resource,\"Cs10\",DeviceN,P1495,Colorant,-1,-1,-1,-1
Resource,\"Cs10\",DeviceN,CalRGB,Colorant,-1,-1,-1,-1
Resource,\"Cs10\",Separation,\"P1495\",DeviceCMYK,0,0.31,0.69,0
XObject,\"R29\",Image,,DeviceRGB,-1,-1,-1,-1
Disclaimer: I work at Atalasoft.
Our product, DotImage with the PDF Reader add-on, can do this. The easiest way is to rasterize the page and then just use any of our image analysis tools to get the colors.
This example shows how to do it if you want to group similar colors -- the deployed example will only work for PNG and JPEG, but if you download the code, it's trivial to include the add-on and get PDF as well (let me know if you need help)
Source here:
http://www.atalasoft.com/cs/blogs/31appsin31days/archive/2008/05/30/color-scheme-generator.aspx
Run it here:
http://www.atalasoft.com/31apps/ColorSchemeGenerator
If you are working with specific and simple PDF documents from a constrained source then you may be able to find the colors by reading through the content stream. However this cannot be a generic solution.
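
As a rough idea of what "reading through the content stream" means, here is a sketch in Python. It assumes you already have the decompressed content stream as text (getting it out of the PDF is left to whatever library you use) and only looks at the device color operators rg/RG (RGB), k/K (CMYK) and g/G (gray). The function name list_colors and the output format are made up to match the question. Splitting on whitespace is naive: it will misread string literals and inline image data.

```python
# Operators that set a device color directly, with their operand counts.
COLOR_OPERATORS = {
    "rg": ("RGB", 3), "RG": ("RGB", 3),    # fill / stroke RGB
    "k": ("CMYK", 4), "K": ("CMYK", 4),    # fill / stroke CMYK
    "g": ("Gray", 1), "G": ("Gray", 1),    # fill / stroke gray
}


def list_colors(content_stream: str) -> list:
    """Return the distinct device colors set in a decompressed content stream."""
    seen = []
    operands = []
    for token in content_stream.split():
        try:
            operands.append(float(token))  # numbers are operands; keep collecting
            continue
        except ValueError:
            pass
        if token in COLOR_OPERATORS:
            space, count = COLOR_OPERATORS[token]
            if len(operands) >= count:
                values = ",".join(format(v, "g") for v in operands[-count:])
                label = f"{space}({values})"
                if label not in seen:
                    seen.append(label)
        operands.clear()  # any operator consumes the operands before it
    return seen


if __name__ == "__main__":
    stream = "1 0 0 rg 10 10 100 50 re f 0 0 0 1 k 0.5 G 1 0 0 rg"
    print("\n".join(list_colors(stream)))
```

Note that PDF color components run from 0 to 1. Colors set through named color spaces (cs/scn), shadings, images and form XObjects are not seen by this sketch at all.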
For example PDF documents can contain gradients or transparency. If your document contains this type of construct then you are likely to end up with a wide range of colors rather than a specific set.
Similarly many PDF documents contain bitmapped images. Given that these will need to be interpolated to be displayed at different resolutions, the set of colors in a displayed PDF may be bigger or different to (though obviously broadly similar to) the embedded bitmap.
Similarly many PDF documents contain constructs in multiple color spaces that are rendered into different color spaces. For example a PDF might contain a DeviceRGB bitmap, a line in an ICC based CMYK color and a Lab based rectangle. The displayed version might be in sRGB for display or CMYK for print. Each of these will influence the precise set of colors.
So the only 100% valid answer is going to be related to a particular render of a PDF at a particular resolution to a particular color space. From the resultant bitmap you can determine the colors that have been used.
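
Once you have such a render, determining the colors is just counting pixel values. A minimal sketch in Python, assuming the render has already been turned into a sequence of (r, g, b) tuples with 0-255 channels by whichever library you use; count_colors and its step parameter are hypothetical names. A step greater than 1 groups similar colors by rounding each channel down to a multiple of step.

```python
from collections import Counter


def count_colors(pixels, step=1):
    """Count how often each color occurs; step > 1 merges near-identical colors."""
    return Counter(
        tuple((channel // step) * step for channel in pixel)
        for pixel in pixels
    )


if __name__ == "__main__":
    # Stand-in for a rendered page: a handful of pixels instead of a real bitmap.
    page = [(100, 0, 0)] * 3 + [(105, 0, 0)] * 2 + [(0, 0, 0)]
    for color, count in count_colors(page).most_common():
        print(f"RGB{color}: {count} pixels")
```

With step=16 the two reds from the question's example, RGB(100,0,0) and RGB(105,0,0), fall into the same bucket, which is one simple way to keep anti-aliasing and interpolation from flooding the list.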
There are a variety of PDF libraries that will do this type of render including DotImage (referenced in another answer) and ABCpdf .NET (on which I work).

Is there a way to use custom fonts in a PDF file?

Well basically I'm finishing school in mid December so I'm just brushing up my resume and I'm wondering if there's a way to use custom fonts (in this case Calibri and Cambria) in a PDF file and make them render correctly on all computers.
Thanks in advance!
EDIT: I'm using MS Word 2007, but am open to suggestions
PDFs don't store text and fonts like other documents; they actually convert the font to vectors, so that no matter what font you use, the document displays exactly as expected. This is why searching for text inside the PDF is such a problem for third-party PDF readers and why even Adobe themselves used to distribute two versions of Acrobat (one with text search, one without).
Another thing to keep in mind is that PDF isn't pixel exact, it's ratio exact. PDF readers generally do not use a 100% zoom level; instead most people read them at "fit to screen" or "fit to page". I point this out because I'm guessing the reason you are trying to use those new Vista/Office 2007 fonts is their LCD subpixel support (which improves readability on LCD screens). This feature will not translate into the PDF: since the letter becomes a vector, subpixel information is lost, and even if it weren't, it would be useless because the vector will be sized to something other than you intended at view time.
The PDF format is capable of embedding fonts, if the font has been marked embeddable by its creator. You'll have to check the software that's creating your PDF to see if it has the capability and how to enable it.
Technically speaking, whether a font may be embedded is controlled by a special flag in the font file (TTF, OpenType or Type 1).
You can view this embedding flag with any font editor program. I recommend FontCreator (by High-Logic),
http://www.high-logic.com/font-editor/fontcreator.html
which has a fully functional free trial without limitations.
You can also change the embedding flag, but legally speaking, for 99% of commercially distributed fonts, this breaks the font's license.
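
If you would rather check the flag with a script than a font editor: for TrueType/OpenType files it is the fsType field of the OS/2 table. Here is a sketch in Python that reads it straight from the file bytes. The function names are my own; it does not handle Type 1 fonts or .ttc collections, and it only reads the flag, it does not change it.

```python
import struct

# fsType bits as defined for the OS/2 table; 0 in the low bits means "installable".
FS_TYPE_BITS = {
    0x0002: "restricted license (no embedding)",
    0x0004: "preview and print embedding",
    0x0008: "editable embedding",
    0x0100: "no subsetting",
    0x0200: "bitmap embedding only",
}


def read_fs_type(font_data: bytes) -> int:
    """Return the raw fsType value from a single-font TTF/OTF file."""
    num_tables = struct.unpack_from(">H", font_data, 4)[0]
    for index in range(num_tables):
        # Table records are 16 bytes each and start right after the 12-byte header.
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", font_data, 12 + 16 * index
        )
        if tag == b"OS/2":
            # fsType is the fifth 16-bit field of the table, at byte offset 8.
            return struct.unpack_from(">H", font_data, offset + 8)[0]
    raise ValueError("no OS/2 table found")


def describe_fs_type(fs_type: int) -> list:
    """Translate the fsType bit field into readable labels."""
    labels = [text for bit, text in FS_TYPE_BITS.items() if fs_type & bit]
    if not fs_type & 0x000E:
        labels.insert(0, "installable embedding")
    return labels


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        print(describe_fs_type(read_fs_type(handle.read())))
```

Run it with the path to a .ttf or .otf file as the only argument.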