I've looked left, right and center but have not found a definitive answer as to whether, and how, the document size in LaTeX can be increased beyond the magic 200-inch limit.
For a project I'm working on, I need to dynamically generate PDF/EPS files that can be printed on a medium up to 32 yards long. The important factor is that the print job must be seamless; it contains color stripes, symbols and text repeated along the entire length.
I know that PDF 1.6+ supports much larger page sizes via the UserUnit setting. Is that something that can be used in LaTeX and, if so, how would I go about it?
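For a sense of scale, here is a small Python sketch of the arithmetic behind the UserUnit idea. It assumes the usual figures: a default user-space unit of 1/72 inch and a cap of 14,400 units per page side (which is where the 200 inches come from). The function name is made up for illustration, and the sketch says nothing about how LaTeX would write the key into the PDF.

```python
POINTS_PER_INCH = 72          # default PDF user-space unit is 1/72 inch
MAX_UNITS_PER_SIDE = 14_400   # assumed viewer limit: 14,400 units = 200 inches


def min_user_unit(length_inches):
    """Smallest UserUnit that fits a page side of this length (never below 1)."""
    length_in_points = length_inches * POINTS_PER_INCH
    return max(1.0, length_in_points / MAX_UNITS_PER_SIDE)


if __name__ == "__main__":
    yards = 32
    inches = yards * 36
    print(f"{yards} yd = {inches} in, UserUnit >= {min_user_unit(inches):.2f}")
```

In other words, a 32-yard page would need every unit scaled by a factor of a little under 6, so the page box itself stays within 14,400 units.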
I am writing a library to generate PDF reports using prawn reports.
One of the features I want in my gem is a means of testing report generation.
The problem is that two visually equal PDFs can be different files at the byte level.
Is there a way to make sure that 2 visually equal PDF have the same bits in the file? Something like XML canonicalization.
'Visual equality' (or 'visual similarity', where only a small percentage of pixels differs on each page) of two different PDFs can occur even if the internal structure of the PDF objects is very different. (Think of a page of 'text', which may use real fonts or which may use 'outline' vector graphics for each glyph's shape...)
That means this equality can only be determined by rendering the two files at the same resolution to page images and then comparing both image sets pixel by pixel. The result of the comparison could be another pixel image that shows all differing pixels as red, or, at your preference, just the number of pixels which do not agree.
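As a rough sketch of the comparison step only (rendering is left to an external tool), assume each page has already been rasterized into rows of (R, G, B) tuples at the same resolution. The function names and the in-memory representation are assumptions made for illustration, not part of any particular tool.

```python
def count_differing_pixels(page_a, page_b):
    """Count pixels that differ between two equally sized page images."""
    same_shape = len(page_a) == len(page_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(page_a, page_b)
    )
    if not same_shape:
        raise ValueError("pages must have the same dimensions")
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(page_a, page_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def diff_image(page_a, page_b, mark=(255, 0, 0), same=(255, 255, 255)):
    """Build an image in which every differing pixel is painted in `mark`."""
    return [
        [mark if pixel_a != pixel_b else same
         for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(page_a, page_b)
    ]
```

A test could then assert that the count is zero for strict equality, or below some small threshold for 'visual similarity'.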
I've described a scriptable way to do this with the help of Ghostscript, pdftk and ImageMagick in this answer:
How to unit test a Python function that draws PDF graphics?
Alternatively, you may have a look at diffpdf (which is available for Linux, Unix, Mac OS X and Windows): it can also compare two PDF files visually.
[ Your literal question was this: "Is there a way to make sure that 2 visually equal PDF have the same bits in the file?" -- However, I'm not sure if you really meant it that way -- hence my above answer. Otherwise I'd have to say: If two PDF files are visually equal, just generate their respective MD5sum to determine if they have the same bits in each file... ]
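For the literal, bit-for-bit reading of the question, a minimal Python sketch of the MD5 check could look like this. The helper names are made up for illustration; only `hashlib` from the standard library is used.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bits(path_a, path_b):
    """True if both files have the same MD5 digest."""
    return file_digest(path_a) == file_digest(path_b)
```

Keep in mind that this only answers whether the files have the same bytes; two files that look the same on screen can still produce different digests.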
I am trying to use TrueType fonts, but I'm finding that when I load different ones, some do not display at certain font sizes while others do. It seems, although I am not certain, that some simply do not show below a certain font size. Is there any way to easily tell which sizes are valid for a TTF by examining the file, rather than simply trying it out in my app?
UPDATE: Considering smilingthax's answer, it would seem that these must be bitmap glyphs, since the smaller sizes are not showing up (some system font is used instead, I think). Does someone know how I can determine the valid sizes I can use with them using the iOS SDK (iPad)?
TTFs can contain bitmap glyphs and/or outline glyphs. Outline glyphs are scalable to any size (e.g. "31.4592 pt"), although they are often optimized (hinted) for certain sizes (8, 10, 12, 16, ..., 72). You can get these sizes via your font API (Win32, FreeType, ...). Bitmap glyphs are only available in certain sizes.
For example, FreeType's FT_Face object has a face_flags member, which can have FT_FACE_FLAG_FIXED_SIZES set. The sizes themselves are accessible via num_fixed_sizes and the available_sizes array.
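If you only want to examine the file, a rough Python sketch is to read the font's table directory and see whether bitmap and/or outline tables are present. This is not the FreeType API; it is a swapped-in check under a few assumptions: the file is a single font (not a .ttc collection), the table-tag sets below are the ones I'd expect for bitmap and outline data, and the function names are hypothetical. The actual strike sizes are stored inside the bitmap tables and are not decoded here.

```python
import struct

# Assumed tags: embedded bitmap strikes vs. scalable outlines.
BITMAP_TABLES = {"EBLC", "EBDT", "CBLC", "CBDT", "bloc", "bdat", "sbix"}
OUTLINE_TABLES = {"glyf", "CFF ", "CFF2"}


def table_tags(font_bytes):
    """Read the table tags from the table directory of a single-font file."""
    # Header: 4-byte version, then a big-endian uint16 table count at offset 4.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    tags = []
    for index in range(num_tables):
        record_offset = 12 + 16 * index  # 12-byte header, 16-byte records
        tag = struct.unpack_from(">4s", font_bytes, record_offset)[0]
        tags.append(tag.decode("latin-1"))
    return tags


def glyph_kinds(font_bytes):
    """Report whether the font carries bitmap and/or outline glyph tables."""
    tags = set(table_tags(font_bytes))
    return {
        "bitmap": bool(tags & BITMAP_TABLES),
        "outline": bool(tags & OUTLINE_TABLES),
    }
```

Usage would be something like `glyph_kinds(open(path, "rb").read())`; a font that reports bitmap glyphs but no outlines is the kind that will only render at its fixed sizes.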
Is there a library/tool which would list all colours used in a PDF document?
I'm sure Acrobat itself would do this, but I would like an alternative (ideally something that could be scripted).
So the idea is that if you have a very simple PDF document with four colours in it, the output might say:
RGB(100,0,0)
RGB(105,0,0)
CMYK(0,0,0,1)
CMYK(1,1,1,1)
You could explore the insides with pdfbox, but you would have to write some code to find and catalog all those colors.
Most PDF tools have access to this information internally but no API to expose it. You could take any tool and add it in.
Apago PDFspy generates an XML file containing all kinds of metadata extracted from PDF files. It reports color usage including spot colors.
We recently added a function called GetPageColorSpaces(0) to the Quick PDF Library (www.quickpdflibrary.com) to retrieve much of the ColorSpace info used in the document.
Here is some sample output.
Resource,"QuickPDFCS2eb0f578",Separation,"HKS 52 E",DeviceCMYK,0.95,0,0.55,0
Resource,"QuickPDFCSb7b05308",Separation,"Black",DeviceCMYK,0,0,0,1
Resource,"QuickPDFCSd9f10810",Separation,"Pantone 117 C",DeviceCMYK,0,0.18,1,0.15
Resource,"QuickPDFCS9314518c",Separation,"All",DeviceCMYK,0,1,0,0.5
Resource,"QuickPDFCS333d463d",Separation,"noplate",DeviceCMYK,1,0,0,0
Resource,"QuickPDFCSb41cafc4",Separation,"noprint",DeviceCMYK,0,1,0,0
Resource,"Cs10",DeviceN,Black,Colorant,-1,-1,-1,-1
Resource,"Cs10",DeviceN,P1495,Colorant,-1,-1,-1,-1
Resource,"Cs10",DeviceN,CalRGB,Colorant,-1,-1,-1,-1
Resource,"Cs10",Separation,"P1495",DeviceCMYK,0,0.31,0.69,0
XObject,"R29",Image,,DeviceRGB,-1,-1,-1,-1
Disclaimer: I work at Atalasoft.
Our product, DotImage with the PDF Reader add-on, can do this. The easiest way is to rasterize the page and then just use any of our image analysis tools to get the colors.
This example shows how to do it if you want to group similar colors -- the deployed example will only work for PNG and JPEG, but if you download the code, it's trivial to include the add-on and get PDF as well (let me know if you need help)
Source here:
http://www.atalasoft.com/cs/blogs/31appsin31days/archive/2008/05/30/color-scheme-generator.aspx
Run it here:
http://www.atalasoft.com/31apps/ColorSchemeGenerator
If you are working with specific and simple PDF documents from a constrained source then you may be able to find the colors by reading through the content stream. However this cannot be a generic solution.
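For that constrained case, a minimal Python sketch could scan an already decoded (uncompressed) content stream for the basic device colour operators: `g`/`G` for gray, `rg`/`RG` for RGB and `k`/`K` for CMYK. The function name is hypothetical, and the sketch deliberately ignores everything else (named colour spaces, images, shadings, and string contents that happen to look like operators).

```python
# Operator -> (colour space label, number of operands it consumes).
COLOUR_OPERATORS = {
    "g": ("Gray", 1), "G": ("Gray", 1),
    "rg": ("RGB", 3), "RG": ("RGB", 3),
    "k": ("CMYK", 4), "K": ("CMYK", 4),
}


def colours_in_content_stream(stream):
    """Collect (space, components) pairs set by simple colour operators."""
    found = set()
    operands = []
    for token in stream.split():
        if token in COLOUR_OPERATORS:
            space, count = COLOUR_OPERATORS[token]
            if len(operands) >= count:
                found.add((space, tuple(operands[-count:])))
            operands = []
        else:
            try:
                operands.append(float(token))
            except ValueError:
                operands = []  # some other operator or object: start over
    return found


if __name__ == "__main__":
    sample = "1 0 0 rg 0 0 100 100 re f 0 0 0 1 k 0.5 G"
    for space, components in sorted(colours_in_content_stream(sample)):
        print(space, components)
```

Note that the components come out in PDF's 0 to 1 range, not as 0 to 255 values.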
For example PDF documents can contain gradients or transparency. If your document contains this type of construct then you are likely to end up with a wide range of colors rather than a specific set.
Similarly, many PDF documents contain bitmapped images. Given that these will need to be interpolated to be displayed at different resolutions, the set of colors in a displayed PDF may be larger than or different from (though obviously broadly similar to) that of the embedded bitmap.
Similarly, many PDF documents contain constructs in multiple color spaces that are rendered into different color spaces. For example, a PDF might contain a DeviceRGB bitmap, a line in an ICC-based CMYK color and a Lab-based rectangle. The displayed version might be in sRGB for display or CMYK for print. Each of these will influence the precise set of colors.
So the only 100% valid answer is going to be related to a particular render of a PDF at a particular resolution to a particular color space. From the resultant bitmap you can determine the colors that have been used.
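Once you have such a bitmap, tallying the colors is the easy part. A minimal Python sketch, assuming the rendered page is available as rows of color tuples (the representation and the function name are assumptions for illustration):

```python
from collections import Counter


def colour_histogram(pixels):
    """Count how often each colour occurs in a rendered page."""
    return Counter(pixel for row in pixels for pixel in row)
```

`len(colour_histogram(page))` then gives the number of distinct colors for that particular render, and `most_common()` lists them by frequency.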
There are a variety of PDF libraries that will do this type of render including DotImage (referenced in another answer) and ABCpdf .NET (on which I work).
I am working on generating a document for printing. It should use a specific TTF font and everything must be printed with vector graphics (for quality). Some of the text should be replaced automatically (e.g. current time). Also it should include a custom-generated EPS image with a chart.
Ideally I would like to have some kind of document template where the text could be replaced easily, and it would be nice if it could import the image by path. But I am not sure which format would be good for this. The best I can think of is LaTeX, but I don't like that it's a lot of manual work to use it with TTF... any other ideas?
By the way, I am using OS X...
The memoir package is very flexible for your special layouts.
XeTeX uses your system fonts (it is installed together with TeX Live).
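To illustrate the template idea, here is a minimal Python sketch that fills placeholders in a LaTeX source before it is compiled. The `@@...@@` placeholder syntax, the `render` function and the example font and file names are all made up for illustration; `fontspec` and `graphicx` are the packages I'd assume for font selection and image inclusion, and you should check how your TeX setup handles EPS files.

```python
from datetime import datetime

# Hypothetical template; @@NAME@@ markers are replaced before compiling.
TEMPLATE = r"""\documentclass{memoir}
\usepackage{fontspec}
\usepackage{graphicx}
\setmainfont{@@FONT@@}
\begin{document}
Generated at @@TIME@@.

\includegraphics{@@CHART@@}
\end{document}
"""


def render(font, chart_path, now=None):
    """Return the LaTeX source with all placeholders filled in."""
    now = now or datetime.now()
    values = {
        "@@FONT@@": font,
        "@@TIME@@": now.strftime("%Y-%m-%d %H:%M"),
        "@@CHART@@": chart_path,
    }
    text = TEMPLATE
    for placeholder, value in values.items():
        text = text.replace(placeholder, value)  # values are not LaTeX-escaped
    return text


if __name__ == "__main__":
    print(render("Gill Sans", "chart.eps"))
```

The result can be written to a .tex file and compiled with xelatex; any text that may contain LaTeX special characters would need escaping first.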
You could blend most of those elements into an EPS using ImageMagick or GIMP Script-Fu.
There are several products out there that will build a PDF for you programmatically. I've only used the ColdFusion Report Builder myself, and that may not be practical or affordable for your application. If your budget allows, I'd look into a commercial reporting product. I know Adobe have several that will generate Flash, FlashPaper or PDF output.