Render the alpha channel into an RGBA .jpg file with 3ds Max and maxscript - scripting

I have a simple script where I render an object with the following command:
Render camera: $Camera001 outputfile: "test.jpg" outputsize:[1920,1080] vfb:off outputHDRbitmap: true
However, the output file is 24-bit RGB. Is it possible to render RGBA with the same command, or with another method of the same simplicity?
I could only find solutions with bitmaps that went over my head.

I would suggest using PNG with alpha instead of JPG, since JPEG has no alpha channel.
The command itself is correct (apart from the output file name, which then needs a .png extension). As a preparation step you may also need to set the PNG options with MAXScript, the same options you would normally set in the PNG configuration dialog:
pngio.setType #true48
pngio.setAlpha true
pngio.setInterlaced false
You can find the complete documentation for pngio here: https://help.autodesk.com/...F08AA370DEB8_htm

Related

How to add footer to pdf with pdfjam or pdftk?

I am using a shell script to modify many pdfs and would like to create a script that adds the page number (1 of X format) to the bottom of PDFs in a directory along with the text of the filename.
I tried using pdfjam with this format:
pdfjam --pagenumbering true
but it fails, saying that pagenumbering is undefined.
Any other recommendations how to do this? I am OK installing other tools but would like this to all be within a shell script.
Thank you
tl;dr: pdfjam --pagecommand '' input.pdf
By default, pdfjam adds the following LaTeX command to every page: \thispagestyle{empty}. By changing the command to an empty command, the default plain page style is used, which consists of a page number at the bottom. Of course you may want to play with other styles or layout options to position the page number differently.
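Since the question asks for a batch run over a directory, the tl;dr command can be wrapped in a small driver. The following is a minimal sketch in Python rather than shell. The helper names and the `-numbered` output suffix are made up for illustration, pdfjam is assumed to be on the PATH, and the sketch only adds the plain page number, not the "1 of X" format or the filename text.

```python
import subprocess
from pathlib import Path


def pdfjam_command(src, dst):
    """Build the argv for one pdfjam run (hypothetical helper)."""
    # An empty --pagecommand replaces pdfjam's default page command, so
    # LaTeX's plain page style (page number in the footer) applies.
    return ["pdfjam", "--pagecommand", "", "--outfile", str(dst), str(src)]


def number_directory(directory):
    """Run pdfjam on every PDF in `directory`; outputs get a -numbered suffix."""
    for src in sorted(Path(directory).glob("*.pdf")):
        if src.stem.endswith("-numbered"):
            continue  # skip files produced by an earlier run
        dst = src.with_name(src.stem + "-numbered.pdf")
        subprocess.run(pdfjam_command(src, dst), check=True)
```

Calling `number_directory(".")` would process the current directory; passing the arguments as a list avoids shell quoting problems with the empty string.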

Change resolution of converted images using Sphinx ImageMagick extension sphinx.ext.imgconverter

We are using Python-Sphinx to build our end user manuals.
In order to automatically convert our assorted graphic file formats, we are using the Sphinx extension sphinx.ext.imgconverter, which uses ImageMagick to convert them to formats that the given build target can understand.
For detailed information see: sphinx.ext.imgconverter
Unfortunately, the output of the converted images does not satisfy our needs. A main issue is the low resolution of the converted images, which gives pixelated outcomes.
Therefore I included the following line to my conf.py:
image_converter_args=["-density 300"]
Now, the build process fails and leaves me with the following error message.
Extension error:
convert exited with error:
[stderr]
magick.exe: unrecognized option `-density 300' at CLI arg 1 # fatal/magick-cli.c/ProcessCommandOptions/428.
Can someone please help me?
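Each argument must be a separate element of the list. Written as one string, "-density 300" is passed to ImageMagick as a single, unknown option.
image_converter_args=["-density", "300"]
Here "-density" is the operator argument and "300" is the value argument.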
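If the options are easier to maintain as one command-line string, `shlex.split` from the standard library produces the token list. This is just a small sketch for conf.py, using the `-density 300` value from the question:

```python
import shlex

# Each ImageMagick token must be its own list element; shlex.split does the
# same word splitting a POSIX shell would do.
image_converter_args = shlex.split("-density 300")
print(image_converter_args)
```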

Crop PDF Content

I have a pdf that I would like to impose. It has 8.5x11" pages, media box, and crop box. I want the pdf to have 17x11" pages, by merging adjacent pages. Unfortunately, most pages have content either completely outside or straddling the crop box. Because each page can only have a single stream and crop box, when imposed, the overlapping content becomes visible. This is bad.
I don't want to rasterize my pdf because that would fix the DPI ahead-of-time. So I won't consider exporting pages as images, appending the images (imagemagick), then embedding these paired images into a new pdf.
I've also had problems imposing in postscript - issues with transparency, font rasterization, and other visual glitches during the pdf->ps->pdf conversions.
The answer should be scriptable.
So far I've tried:
podofo imposition scripts (lua)
PyPDF2 (python)
ghostscript
latex
The question "Ghostscript removes content outside the crop box?" suggests that ghostscript's pdfwrite module, when generating an output pdf file, will rasterize and crop content according to the crop box. So I'd only have to pipe my pdf through ghostscript's pdfwrite module. Unfortunately, this doesn't work.
I was about to give up when I tried printing the pdf to another pdf through evince. It works perfectly - text & vector elements within the crop box are not rasterized, and elements outside the crop box are removed (I haven't tested straddling elements yet). The quality is high - resolution (page size) and appearance are identical. In fact, everything seems to be the same except for the metadata.
So:
the question is possible
the answer already exists
How can I access it?
I think this functionality might be provided by CUPS's pdftopdf binary. I don't have any problems calling an external binary, but I can't figure out how to use pdftopdf.
Edit: Link to test pdf. It contains raster, vector, and text items - some partially occluded by partially transparent items - that span as well as abut adjacent pages. Once again, printing this PDF through cups appears to crop all content outside the crop box. However, opening the filtered pdf in inkscape shows that the off-page items are individually masked, not cropped - except text, which is trimmed.
The trick is to use Form XObjects to impose multiple pages within a single page. Form XObjects can reference entire PDF pages, and maintain independent clips. PyPDF2 doesn't support Form XObjects, so merging unifies the stream of all input pages such that they share the clip/media box of the output page. I've been successful in using both pdflatex and pdfrw (python) - test programs are inlined below. Since Form XObjects are derived from a similar postscript level 2 feature, as suggested by KenS it should be possible to achieve the same goal in ghostscript using "page clips". In fact he shared a ghostscript 2x1 imposition script in another answer, but it appears horrendously complicated. Combined with the font rasterization issues of poppler's pdftops (even with compatibility level > 1.4), I've abandoned the ghostscript approach.
Latex script derived from How to stitch two PDF pages together as one big page?. Requires pdflatex:
\documentclass{article}
\usepackage{pdfpages}
\usepackage[paperwidth=8.5in, paperheight=11in]{geometry}
\usepackage[multidot]{grffile}
\pagestyle{plain}
\begin{document}
\setlength\voffset{+0.0in}
\setlength\hoffset{+0.0in}
\includepdf[ noautoscale=true
, frame=false
, pages={1}
]
{<file.pdf>}
\eject \paperwidth=17in \pdfpagewidth=17in \paperheight=11in \pdfpageheight=11in
\includepdf[ nup=2x1
, noautoscale=true
, frame=false
, pages={2-,}
]
{<file.pdf>}
\end{document}
pdfrw (python script) derived from pdfrw:examples:booklet. Requires pdfrw >= 0.2:
#!/usr/bin/env python3
# Copyright:
# Yclept Nemo
# 2016
# License:
# GPLv3
import itertools
import argparse
import pdfrw
# from itertool recipes in the python documentation
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)
def pagemerge(page, *pages):
    merged = pdfrw.PageMerge() + page
    for page in reversed(list(itertools.takewhile(lambda i: i is not None, reversed(pages)))):
        merged = merged + page
        merged[-1].x = merged[-2].x + merged[-2].w
    return merged.render()
parser = argparse.ArgumentParser(description='Impose PDF files using Form XOBjects')
parser.add_argument\
( "source"
, help="PDF, source path"
, type=pdfrw.PdfReader
)
parser.add_argument\
( "-s", "--spacer"
, help="PDF, spacer path"
, type=lambda fp: next(iter(pdfrw.PdfReader(fp).pages), None)
)
parser.add_argument\
( "target"
, help="PDF, target path"
)
args = parser.parse_args()
pages = args.source.pages[:1]
for pair in grouper(args.source.pages[1:], 2):
    assert pair[0] is not None
    pages.append(pagemerge(pair[0], args.spacer, pair[1]))
# include metadata in target
target = pdfrw.PdfWriter()
target.addpages(pages)
target.trailer.Info = args.source.Info
target.write(args.target)
Some idiosyncrasies as of pdfrw 0.2:
Note that the operations +=, append and extend are not defined for pdfrw.PageMerge, even though it behaves like a list. Furthermore + acts like += in that it modifies the left-hand-side object.
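To see which input pages end up on which output sheet without touching any PDF, the grouping logic of the script can be run on plain page numbers. This is only an illustrative sketch: `imposition_plan` is a hypothetical helper, and it mirrors the script's choice of keeping the first page on its own sheet and pairing the rest.

```python
import itertools


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length chunks, padding the last one."""
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)


def imposition_plan(page_count):
    """Return the page numbers (1-based) placed on each output sheet."""
    numbers = list(range(1, page_count + 1))
    plan = [tuple(numbers[:1])]           # first page stays on its own sheet
    plan.extend(grouper(numbers[1:], 2))  # remaining pages go side by side
    return plan


print(imposition_plan(6))  # [(1,), (2, 3), (4, 5), (6, None)]
```

The trailing `None` marks a sheet with an empty right half, which is the case the script's `pagemerge` has to tolerate.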
Ghostscript and the pdfwrite device do not, in general, rasterise the content of input PDF files (the caveat is for cases involving transparent input and the output being < PDF 1.4).
Objects that are entirely clipped out are not preserved in the output.
So the short answer is that this should be entirely feasible using Ghostscript and the pdfwrite device, with the advantage that it's possible to impose the pages as well in a single operation. I do have an open bug report about clipping in a similar situation (reverse imposition) but have not yet had time to address it.
Note that Ghostscript normally uses the MediaBox for the clip region, if you want to use the CropBox then you need to add -dUseCropBox to the command line.

Convert PDF text into outlines?

Does anybody know a way to vectorize the text in a PDF document? That is, I want each letter to be a shape/outline, without any textual content. I'm using a Linux system, and open source or a non-Windows solution would be preferred.
The context: I'm trying to edit some old PDFs, for which I no longer have the fonts. I'd like to do it in Inkscape, but that will replace all the fonts with generic ones, and that's barely readable. I've also been converting back and forth using pdf2ps and ps2pdf, but the font info stays there. So when I load it into Inkscape, it still looks awful.
Any ideas? Thanks.
To achieve this, you will have to:
Split your PDF into individual pages;
Convert your PDF pages into SVG;
Edit the pages you want
Reassemble the pages
This answer will omit step 3, since that's not programmable.
Splitting the PDF
If you don't need a programmatic way to split documents, a modern option is stapler. In your favorite shell:
stapler burst file.pdf
This generates {file_1.pdf,...,file_N.pdf}, where 1...N are the PDF pages. Stapler itself uses PyPDF2, and the code for splitting a PDF file is not that complex. The following function splits a file and saves the individual pages in the current directory (shamelessly copied from the commands.py file):
import os
from PyPDF2 import PdfFileWriter, PdfFileReader
def split(filename):
    # PDFs are binary files, so open them in 'rb' mode
    with open(filename, 'rb') as inputfp:
        inputpdf = PdfFileReader(inputfp)
        numpages = inputpdf.getNumPages()
        base, ext = os.path.splitext(os.path.basename(filename))
        # Prefix the output template with zeros so that ordering is preserved
        # (page 10 after page 09). The width is the digit count of the last page.
        output_template = ''.join([
            base,
            '_',
            '%0',
            str(len(str(numpages))),
            'd',
            ext
        ])
        for page in range(numpages):
            outputpdf = PdfFileWriter()
            outputpdf.addPage(inputpdf.getPage(page))
            outputname = output_template % (page + 1)
            with open(outputname, 'wb') as fp:
                outputpdf.write(fp)
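The zero-padded naming scheme can be checked in isolation. One way to derive the padding width is from the digit count of the page total, which stays correct at exact powers of ten (a logarithm-based width such as ceil(log10(10)) yields 1, one digit short). The helper name `page_name` below is hypothetical:

```python
def page_name(base, ext, page, page_count):
    """Build a zero-padded file name such as report_07.pdf."""
    width = len(str(page_count))  # digits needed for the largest page number
    # '*' takes the field width from the argument tuple
    return "%s_%0*d%s" % (base, width, page, ext)


print(page_name("report", ".pdf", 7, 12))   # report_07.pdf
print(page_name("report", ".pdf", 9, 10))   # report_09.pdf
```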
Converting the individual pages to SVG
Now to convert the PDFs to editable files, I'd probably use pdf2svg.
pdf2svg input.pdf output.svg
If we take a look at the pdf2svg.c file, we can see that the code in principle is not that complex (assuming the input filename is in the filename variable and the output file name is in the outputname variable). A minimal working example in python follows. It requires the pycairo and pypoppler libraries:
import os
import cairo
import poppler
def convert(inputname, outputname):
    # Convert the input file name to an URI to please poppler
    uri = 'file://' + os.path.abspath(inputname)
    pdffile = poppler.document_new_from_file(uri, None)
    # We only have one page, since we split prior to converting. Get the page
    page = pdffile.get_page(0)
    # Get the page dimensions
    width, height = page.get_size()
    # Open the SVG file to write on
    surface = cairo.SVGSurface(outputname, width, height)
    context = cairo.Context(surface)
    # Now we finally can render the PDF to SVG
    page.render_for_printing(context)
    context.show_page()
    # Flush and close the SVG file
    surface.finish()
At this point you should have an SVG in which all text has been converted to paths, and will be able to edit with Inkscape without rendering issues.
Combining steps 1 and 2
You can call pdf2svg in a for loop to do that. But you would need to know the number of pages beforehand. The code below figures the number of pages and does the conversion in a single step. It requires only pycairo and pypoppler:
import os
import cairo
import poppler
def convert(inputname, base=None):
    '''Converts a multi-page PDF to multiple SVG files.
    :param inputname: Name of the PDF to be converted
    :param base: Base name for the SVG files (optional)
    '''
    if base is None:
        base, ext = os.path.splitext(os.path.basename(inputname))
    # Convert the input file name to an URI to please poppler
    uri = 'file://' + os.path.abspath(inputname)
    pdffile = poppler.document_new_from_file(uri, None)
    pages = pdffile.get_n_pages()
    # Prefix the output template with zeros so that ordering is preserved
    # (page 10 after page 09). The width is the digit count of the last page.
    output_template = ''.join([
        base,
        '_',
        '%0',
        str(len(str(pages))),
        'd',
        '.svg'
    ])
    # Iterate over all pages
    for nthpage in range(pages):
        page = pdffile.get_page(nthpage)
        # Output file name based on template
        outputname = output_template % (nthpage + 1)
        # Get the page dimensions
        width, height = page.get_size()
        # Open the SVG file to write on
        surface = cairo.SVGSurface(outputname, width, height)
        context = cairo.Context(surface)
        # Now we finally can render the PDF to SVG
        page.render_for_printing(context)
        context.show_page()
        # Free some memory
        surface.finish()
Assembling the SVGs into a single PDF
To reassemble you can use the pair inkscape / stapler to convert the files manually. But it is not hard to write code that does this. The code below uses rsvg and cairo. To convert from SVG and merge everything into a single PDF:
import rsvg
import cairo
def convert_merge(inputfiles, outputname):
    # We have to create a PDF surface and inform a size. The size is
    # irrelevant, though, as we will define the sizes of each page
    # individually.
    outputsurface = cairo.PDFSurface(outputname, 1, 1)
    outputcontext = cairo.Context(outputsurface)
    for inputfile in inputfiles:
        # Open the SVG
        svg = rsvg.Handle(file=inputfile)
        # Set the size of the page itself
        outputsurface.set_size(svg.props.width, svg.props.height)
        # Draw on the PDF
        svg.render_cairo(outputcontext)
        # Finish the page and start a new one
        outputcontext.show_page()
    # Free some memory
    outputsurface.finish()
PS: It should be possible to use the command pdftocairo, but it doesn't seem to call render_for_printing(), which makes the output SVG maintain the font information.
I'm afraid that to vectorize the PDFs you would still need the original fonts (or a lot of work).
Some possibilities that come to mind:
dump the uncompressed PDF with pdftk and discover what the font names are, then look for them on FontMonster or other font service.
use some online font recognition service to get a close match with your font, in order to preserve kerning (I guess kerning and alignment are what's making your text unreadable)
try replacing the fonts manually (again pdftk to convert the PDF to a PDF which is editable with sed. This editing will break the PDF, but pdftk will then be able to recompress the damaged PDF to a useable one).
Here's what you really want - font substitution. You want some code/app to be able to go through the file and make appropriate changes to the embedded fonts.
This task is doable and is anywhere from easy to non-trivial. It's easy when you have a font that matches the metrics of the font in the file and the encoding used for the font is sane. You could probably do this with iText or DotPdf (the latter is not free beyond the evaluation, and is my company's product). If you modified pdf2ps, you could probably manage changing the fonts on the way through too.
If the fonts used in the file are font subsets that have creative reencoding, then you are in hell and will likely have all manner of pain doing the change. Here's why:
PostScript was designed at a point when there was no Unicode. Adobe used a single byte for characters and whenever you rendered any string, the glyph to draw was taken from a 256 entry table called the encoding vector. If a standard encoding didn't have what you wanted, you were encouraged to make fonts on the fly based on the standard font that differed only in encoding.
When Adobe created Acrobat, they wanted to make the transition from PostScript as easy as possible, so that font mechanism was modeled. When the ability to embed fonts into PDFs was added, it was clear that this would bloat the files, so PDF also included the ability to have font subsets. Font subsets are made by taking an existing font, removing all the glyphs that won't be used, and re-encoding it into the PDF. There may be no standard relationship between the encoding vector and the code points in the file - all those may be changed. Instead, there may be an embedded PostScript function /ToUnicode which will translate encoded characters to a Unicode representation.
So yeah, non-trivial.
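A toy model may make the re-encoding problem more concrete. The sketch below is not how any real PDF producer assigns codes; it simply hands out byte codes in order of first use, which is enough to show that the stored bytes are meaningless without the accompanying reverse mapping. All names are made up for illustration.

```python
def build_subset_encoding(text):
    """Map each distinct character to a code in order of first appearance."""
    encoding = {}
    for ch in text:
        if ch not in encoding:
            encoding[ch] = len(encoding) + 1
    return encoding


def encode(text, encoding):
    """Turn text into the byte string a content stream would carry."""
    return bytes(encoding[ch] for ch in text)


def decode(data, to_unicode):
    """Recover the text, which is only possible with the reverse map."""
    return "".join(to_unicode[code] for code in data)


text = "hello"
encoding = build_subset_encoding(text)
to_unicode = {code: ch for ch, code in encoding.items()}
data = encode(text, encoding)
print(data)                      # b'\x01\x02\x03\x03\x04'
print(decode(data, to_unicode))  # hello
```

Swapping in a different font means reproducing this private code assignment for the new font, which is where the pain comes from.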
For the folks who come after me:
The best solutions I found were to use Evince to print as SVG, or to use the pdf2svg program that's accessible via Synaptic on Mint. However, Inkscape wasn't able to cope with the resulting SVGs--it entered an infinite loop with the error message:
File display/nr-arena-item.cpp line 323 (?): Assertion item->state & NR_ARENA_ITEM_STATE_BBOX failed
I'm giving up this quest for now, but maybe I'll try again in a year or two. In the meantime, maybe one of these solutions will work for you.

Rendering the whole media box of a pdf page into a png file using ghostscript

I'm trying to render PDF pages into PNG files using Ghostscript v9.02. For that purpose I'm using the following command line:
gswin32c.exe -sDEVICE=png16m -o outputFile%d.png mypdf.pdf
This is working fine when the pdf crop box is the same as the media box, but if the crop box is smaller than the media box, only the media box is displayed and the border of the pdf page is lost.
I know usually pdf viewers only display the crop box but I need to be able to see the whole media page in my png file.
Ghostscript documentation says that per default the media box of a document is rendered, but this does not work in my case.
Does anyone have an idea how I could render the whole media box using Ghostscript? Could it be that, for the PNG file device, only the crop box is rendered? Am I maybe forgetting a specific option?
For example, this pdf contains some registration marks outside of the crop box, which are not present in the output png file. Some more information about this pdf:
media box:
width: 667
height: 908 pts
crop box:
width: 640
height: 851
OK, now that revers has re-stated his problem into that he is looking for "generic code", let me try again.
The problem with a "generic code" is that there are many "legal" formal representations of "CropBox" statements which could appear in a PDF. All of the following are possible and correct and set the same values for the page's CropBox:
/CropBox[10 20 500 700]
/CropBox[ 10 20 500 700 ]
/CropBox[10 20 500 700 ]
/CropBox [10 20 500 700]
/CropBox [ 10 20 500 700 ]
/CropBox [ 10.00 20.0000 500.0 700 ]
/CropBox [
10
20
500
700
]
The same is true for ArtBox, TrimBox, BleedBox and MediaBox. Therefore you need to "normalize" the *Box representation inside the PDF source code if you want to edit it.
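For reading the values (as opposed to editing the file), a whitespace-tolerant pattern already copes with all of the spellings listed above. This is a rough sketch with a hypothetical helper `find_cropbox`; it assumes the entry is stored uncompressed and holds four literal numbers, which a real PDF does not guarantee (the array may sit inside a compressed object stream).

```python
import re

# A PDF number: optional sign, integer or decimal.
NUMBER = rb"[-+]?(?:\d+\.?\d*|\.\d+)"
# Optional whitespace (including newlines) around the bracket and between
# the four numbers.
CROPBOX = re.compile(
    rb"/CropBox\s*\[\s*"
    + rb"\s+".join([rb"(" + NUMBER + rb")"] * 4)
    + rb"\s*\]"
)


def find_cropbox(pdf_bytes):
    """Return the first CropBox as four floats, or None if there is none."""
    match = CROPBOX.search(pdf_bytes)
    if match is None:
        return None
    return tuple(float(group.decode("ascii")) for group in match.groups())


print(find_cropbox(b"/CropBox [ 10.00 20.0000 500.0 700 ]"))
```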
First Step: "Normalize" the PDF source code
Here is how you do that:
Download qpdf for your OS platform.
Run this command on your input PDF:
qpdf --qdf input.pdf output.pdf
The output.pdf now will have a kind of normalized structure (similar to the last example given above), and it will be easier to edit, even with a stream editor like sed.
Second Step: Remove all superfluous *Box statements
Next, you need to know that the only essential *Box is MediaBox. This one MUST be present; the others are optional (in a certain prioritized way). If the others are missing, they default to the same values as MediaBox. Therefore, in order to achieve your goal, we can simply delete all code that is related to them. We'll do it with the help of sed.
That tool is normally installed on all Linux systems -- on Windows download and install it from gnuwin32.sf.net. (Don't forget to install the named "dependencies" should you decide to use the .zip file instead of the Setup .exe).
Now run this command:
sed.exe -i.bak -e "/CropBox/,/]/s#.# #g" output.pdf
Here is what this command is supposed to do:
-i.bak tells sed to edit the original file inline, but to also create a backup file with a .bak suffix (in case something goes wrong).
/CropBox/ states the first address line to be processed by sed.
/]/ states the last address line to be processed by sed.
s tells sed to do substitutions for all lines from first to last addressed line.
#.# #g tells sed which kind of substitution to do: replace each arbitrary character ('.') in the address space by a blank (' '), globally ('g').
We substitute all characters by blanks (instead of by 'nothing', i.e. deleting them) because otherwise we'd get complaints about "PDF file corruption", since the object reference counting and the stream lengths would have changed.
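The same length-preserving idea can be sketched in Python for readers without sed. This is a variant, not a translation: it blanks only the matched entry rather than whole lines, and it also covers TrimBox, BleedBox and ArtBox. The function name is hypothetical, and the input is assumed to be an uncompressed file such as the qpdf --qdf output from the first step.

```python
import re

# Any optional box entry (everything except MediaBox), up to and including
# its closing bracket. The negated class also matches newlines.
OPTIONAL_BOX = re.compile(rb"/(?:CropBox|TrimBox|BleedBox|ArtBox)\s*\[[^\]]*\]")


def blank_optional_boxes(pdf_bytes):
    """Overwrite optional *Box entries with spaces, keeping the byte count."""
    return OPTIONAL_BOX.sub(lambda m: b" " * len(m.group(0)), pdf_bytes)


sample = b"<< /MediaBox [0 0 667 908]\n/CropBox [\n10\n20\n500\n700\n]\n>>"
cleaned = blank_optional_boxes(sample)
print(len(sample) == len(cleaned))  # True: byte offsets are unchanged
```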
Third step: run your Ghostscript command
You know that already well enough:
gswin32c.exe -sDEVICE=png16m -o outputImage_%03d.png output.pdf
All the three steps from above can easily be scripted, which I'll leave to you for your own pleasure.
First, let's get rid of a misunderstanding. You wrote:
"This is working fine when the pdf crop box is the same as the media box, but if the crop box is smaller than the media box, only the media box is displayed and the border of the pdf page is lost."
That's not correct. If the CropBox is smaller than the MediaBox, then only the CropBox should be displayed (not the MediaBox). And that is exactly how it was designed to work. This is the whole idea behind the CropBox concept...
At the moment I cannot think of a solution that works automatically for each PDF and all possible values that can be there (unless you want to use payware).
To manually process the PDF you linked to:
Open the PDF in a good text editor (one that doesn't mess with existing EOL conventions, and doesn't complain about binary parts in the file).
Search for all spots in the file that contain the /CropBox keyword.
Since you have only one page in the PDF, it should find only one spot.
This could read like /CropBox [12.3456 78.9012 345.67 890.123456].
Now edit this part, taking care not to add to (or lose from) the number of already existing characters:
Set the value to your wanted one: /CropBox [0.00000 0.00000 667.00 908.000000]. (You can use spaces instead of my .0000.. parts, but if I do, the SO editor will eat them and you'll not see what I originally typed...)
Save the file under a new name.
A PDF viewer should now show the full MediaBox (as of your specification).
When you convert the new file with Ghostscript to PNG, the bigger page will be visible.
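The manual edit above can be mimicked in a few lines, padding with spaces so that the entry keeps its original byte length. Treat this as a sketch under the same assumptions as the manual procedure (a single, uncompressed /CropBox entry); `set_cropbox` is a made-up name.

```python
import re

CROPBOX_VALUE = re.compile(rb"(/CropBox\s*\[)([^\]]*)(\])")


def set_cropbox(pdf_bytes, box):
    """Replace the first CropBox value, padding with spaces to the old length."""
    match = CROPBOX_VALUE.search(pdf_bytes)
    if match is None:
        raise ValueError("no /CropBox entry found")
    old = match.group(2)
    # %g keeps up to six significant digits, enough for typical page sizes
    new = " ".join("%g" % value for value in box).encode("ascii")
    if len(new) > len(old):
        raise ValueError("new value does not fit in the existing entry")
    padded = new.ljust(len(old))  # trailing spaces are legal inside the array
    return pdf_bytes[:match.start(2)] + padded + pdf_bytes[match.end(2):]


original = b"/CropBox [12.3456 78.9012 345.67 890.123456]"
edited = set_cropbox(original, (0, 0, 667, 908))
print(edited)
```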