Printing PDF differs on different printers - pdf

I created an A4 PDF document using iTextSharp. It contains a simple table in which every cell is in fact a label on the page. I drew it using PdfContentByte, so the programmatic drawing area is 595 x 842 points, and I drew the rectangles (the table cells) using points as units.
I fixed the size and position (in points) of the printed content by checking printed pages on my printer. I used Acrobat Reader with these options: no scaling ('None') and the default paper size of 8.3" x 11.7".
Now if I print the same PDF on a different printer, the content (the table) is shifted to the left and/or towards the top. So the distances between the page edges and the outer frame of the table differ from printer to printer. Sometimes the content is cut off, but I know that different printers have different printable areas, so that is understood.
But I cannot understand why it is shifted. Are there other parameters that I don't know about?
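For reference, the relationship between the 595 x 842 point drawing area and the 8.3" x 11.7" paper size mentioned above can be checked with a few lines of Python. This is only a unit-conversion sketch (1 inch = 72 points = 25.4 mm); the helper names are made up for illustration and it says nothing about what a given printer driver does with the page.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4

def mm_to_points(mm):
    """Convert millimetres to PDF points (1/72 inch)."""
    return mm / MM_PER_INCH * POINTS_PER_INCH

def points_to_inches(pt):
    """Convert PDF points to inches."""
    return pt / POINTS_PER_INCH

# A4 is defined as 210 mm x 297 mm
a4_width_pt = mm_to_points(210)
a4_height_pt = mm_to_points(297)

# The rounded point values used in the question, expressed in inches
width_in = points_to_inches(595)
height_in = points_to_inches(842)
```

So the 595 x 842 canvas is A4 rounded to whole points, and the small rounding difference is far too small to explain a visible shift.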

Related

iText7 - create PDF with exact dimensions when printed - how?

I'm creating a simple PDF using iText7 (C#) but I need it to be printed at exactly the right size. Here's my code:
PdfWriter writer = new PdfWriter("output.pdf");
PdfDocument pdf = new PdfDocument(writer);
pdf.SetDefaultPageSize(iText.Kernel.Geom.PageSize.LETTER);
var page = pdf.AddNewPage();
page.SetCropBox(new iText.Kernel.Geom.Rectangle(36, 36, 7.5f * 72, 10 * 72));
PdfCanvas canvas = new PdfCanvas(page);
canvas.SetStrokeColor(ColorConstants.BLACK).SetLineWidth(3);
canvas.MoveTo(36, 36);
canvas.LineTo(36, 36 + 72); // Draw a line 1 inch long
canvas.LineTo(36 + 72, 36 + 72); // Draw a second line, perpendicular to the first, also 1 inch long
canvas.ClosePathStroke();
pdf.Close();
If I right-click the resulting PDF and select "Print", my triangle is off the bottom of the page.
When I open the resulting PDF in the PDF program I'm using (PDF Architect), it gives me a few options:
If I just click "Print", it gives me lines that are 1 1/16" long and start about 1/8" from the edge of the page, so by default PDF Architect seems to be taking the contents of my crop box and expanding it to the maximum page availability.
If I click on "Fit" before clicking "Print", that results in the desired output - lines 1" long, starting 1/2" from each side of the page. That works but is error-prone - too easy to forget to click "Fit" every time.
Is there a way to generate a PDF that contains information that says "I'm targeting this document at letter size, but I'm staying 1/2" away from all the edges, so when you print, if the printer has margins <= 1/2 inch you should be fine, and just print it exactly how I've described without any shrinking or enlarging"?
You will not be able to completely control this from the PDF document. The PDF processor (e.g. viewing application) or the printer (driver) will always be able to scale the content up or down.
Apparently, PDF Architect has the "Fit" option enabled by default, so it scales the page to the selected paper size.
You are setting a crop box of 7.5x10 in. I assume you're printing to Letter sized (8.5x11 in) paper. So the 7.5x10 page will indeed be scaled up, and your content will become slightly larger.
Is there a way to generate a PDF that contains information that says "I'm targeting this document at letter size, but I'm staying 1/2" away from all the edges, so when you print, if the printer has margins <= 1/2 inch you should be fine, and just print it exactly how I've described without any shrinking or enlarging"?
I would not set the crop box. When the pages in the PDF document are Letter size and the output paper is also Letter size, it should not matter whether the "Fit" option is enabled or not, as no scaling needs to happen. It's definitely not a foolproof solution, but at least it's less error-prone.
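The scaling argument in this answer can be roughed out in Python. This is a sketch under a simplifying assumption: the "fit" option scales the visible page uniformly to the full paper size and ignores the printer's unprintable margins. The function name is hypothetical and is not part of iText or PDF Architect.

```python
def fit_scale(page_w_in, page_h_in, paper_w_in, paper_h_in):
    """Uniform scale factor a 'fit to paper' option would apply.

    Assumption: the whole sheet is usable; real drivers fit to a
    smaller printable area, which gives a smaller factor.
    """
    return min(paper_w_in / page_w_in, paper_h_in / page_h_in)

# Crop box of 7.5 x 10 in printed on Letter (8.5 x 11 in)
cropped = fit_scale(7.5, 10, 8.5, 11)

# Full Letter page printed on Letter paper
full = fit_scale(8.5, 11, 8.5, 11)
```

With the crop box the factor is greater than 1, so a 1" line grows; without it the factor is exactly 1 and the "fit" setting no longer matters. The 1 1/16" measured in the question is smaller than this idealised factor would give, which may be down to the driver fitting to its printable area rather than the full sheet.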

How to make long text fit into a text_frame? Python-pptx

I'm working with python-pptx to create a portfolio of candidates in a PowerPoint presentation. There is one candidate per slide, and each of them has provided information about themselves such as name, contacts and a minibio (the problem I'm here to solve).
The text_frame, created with a given height and width, must fit within the slide but must also contain the full length of each minibio, which is not happening.
With a long phrase (>200 characters at font size 12), the text exceeds the size of the text box and runs off the slide, so in presentation mode or in a PDF export the overrun text is lost.
Is there any way to confine the text to the shape/size of the text_frame? (Bonus points if the solution doesn't change the font size.)
Just found one parameter that helped to find the answer
When creating a text_box object with slide.shapes.add_textbox() and getting its text_frame, setting text_frame.word_wrap = True makes the text wrap within the dimensions of the text_box.
The code shows it better
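from pptx.util import Cm

# creates text box with add_textbox(left, top, width, height);
# `slide` is an existing slide object from the presentation
txBox = slide.shapes.add_textbox(Cm(16), Cm(5), Cm(17), Cm(13))
tf = txBox.text_frame
tf.word_wrap = True  # wrap lines at the box width instead of overflowing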
Before word_wrap parameter
After word_wrap parameter
The short answer is "No". PowerPoint is a page-layout environment, and much like the front page of a newspaper, text "story" content needs to be trimmed to fit the allotted space.
We're perhaps not used to this because word-processing, spreadsheet, and web-page content is "flowed" into a (practically) unlimited space, but the area of a PowerPoint slide is quite finite. Also, using it for large text blocks is somewhat of an off-label use. There is a certain amount of flexibility provided by reducing the font size, but not as much as one might expect. Even accommodating 20% additional text requires what appears as a pretty radical change in font size.
I've encountered this problem again and again, and the only solution I have ever seen work reliably is hand-curating the content to fit.
python-pptx has one experimental feature to address this but its operation has never been very satisfactory and it's tricky to get working. https://python-pptx.readthedocs.io/en/latest/api/text.html#pptx.text.text.TextFrame.fit_text
The business of fitting text is the role of a rendering engine, which python-pptx is not.
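Since python-pptx cannot measure rendered text, one pragmatic aid when hand-curating is a rough pre-check of how many wrapped lines a minibio is likely to need. The sketch below uses only the standard library and rests on two loudly stated assumptions: an average character width of 0.5 em and a line height of 1.2 em. Both are guesses that vary by font, so treat the result as a hint, not a measurement. The function name is made up for illustration.

```python
import textwrap

CM_TO_PT = 72 / 2.54  # 1 inch = 2.54 cm = 72 points

def estimated_fit(text, box_w_cm, box_h_cm, font_pt,
                  char_w_em=0.5, line_h_em=1.2):
    """Guess whether `text` fits a box; returns (fits, lines_needed, max_lines).

    char_w_em and line_h_em are assumed averages, not real font metrics.
    """
    chars_per_line = int(box_w_cm * CM_TO_PT / (font_pt * char_w_em))
    max_lines = int(box_h_cm * CM_TO_PT / (font_pt * line_h_em))
    lines = textwrap.wrap(text, width=chars_per_line)
    return len(lines) <= max_lines, len(lines), max_lines

# 17 cm x 13 cm box at 12 pt, as in the question's add_textbox call
fits, needed, available = estimated_fit("word " * 40, 17, 13, 12)
```

A check like this can flag the minibios that need trimming before the deck is generated; it does not replace looking at the rendered slide.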

How do I use iTextSharp (or iText) to crop and copy a page from one PDF to another

I've written code to do the following:
Take a PDF of a certain page size (e.g., 8.5" x 11")
Create a new PDF with a larger page size (e.g., 17" x 11")
Impose the original PDF onto the new one (e.g., 2-up such that the resulting new PDF has the original PDF side-by-side)
To do this, I use the PdfWriter.GetImportedPage method to get the current page from the original PDF, then use the PdfContentByte.AddTemplate(page, x, y) method to place the original page onto the current page of the new PDF.
My new challenge is that I need to crop the original PDF before adding it to the new PDF. For example, imagine I want to crop 2" off of the original PDF before imposing it. The input PDF would still be 8.5" x 11" and the new PDF would still be 17" x 11", but the two "copies" of the original PDF in the new one would each have had 2" removed from their top, right, bottom and left sides.
Hopefully these images can make this clearer. Here's what I have now, doing a 2-up imposition. (This is working swimmingly.)
But here's what I need to do:
I know that I can alter the display of the PDF in a viewer by using the MediaBox or CropBox settings, but those settings aren't respected by AddTemplate. I know that with AddTemplate I can use a transform matrix to position the page or to scale or rotate it, but I don't want to shrink the original PDF, I want to crop it.
Thanks
I found that I can use the BoundingBox of the imported page to crop it prior to adding it to the new PDF (via AddTemplate).
So my code looks something like this:
// reader is the PdfReader opened on the original PDF
PdfImportedPage page = writer.GetImportedPage(reader, pageNumber);
// Crop!
page.BoundingBox = new Rectangle(llx, lly, urx, ury);
// Add to new PDF
writer.DirectContent.AddTemplate(page, x, y);
That does the trick!
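For the 2" example from the question, the four values passed to the Rectangle (llx, lly, urx, ury) are just the page edges moved inward, expressed in points. A small Python sketch of that arithmetic follows; the function name is hypothetical, and it assumes the source page's lower-left corner sits at (0, 0), which is common but not guaranteed.

```python
POINTS_PER_INCH = 72

def crop_rectangle(page_w_in, page_h_in, trim_in):
    """Return (llx, lly, urx, ury) in points after trimming every side.

    Assumes the page origin (lower-left corner) is at (0, 0).
    """
    trim = trim_in * POINTS_PER_INCH
    llx = trim
    lly = trim
    urx = page_w_in * POINTS_PER_INCH - trim
    ury = page_h_in * POINTS_PER_INCH - trim
    return (llx, lly, urx, ury)

# 8.5" x 11" page with 2" removed from each side
rect = crop_rectangle(8.5, 11, 2)
```

The remaining area is 4.5" x 7", so the x and y passed to AddTemplate may also need adjusting if the cropped pages should sit flush in the new layout.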

How is this pdf encoded? The font looks funny

I have seen this effect many times while reading PDF documents. Some PDFs have this funny smudged font that looks like a scanned image. However, I am able to select the text, and while it is selected the highlighted text appears different, as seen in the images.
Default appearance
Appearance on selection of font
Overall, it seems like some OCR is happening behind the scenes.
The document reader I am using is Atril 1.12.2 document viewer.
My question is: What is encoded in the pdf, image or text? What is happening to text when I am selecting it?
Another nice change can be observed in the document shared by the OP:
What we see here is indeed the result of OCR. But it is not OCR happening behind the scenes in the viewer: the OCR has already happened beforehand, and its results have been integrated into the PDF.
The PDF page actually contains a scanned image upon which invisible text is drawn.
As long as nothing is selected, Atril shows exactly that, you only see the scanned image. As soon as you start selecting text, though, it appears to cover the marked area in blue and display the marked (formerly invisible) text in white upon it.
In situations, therefore, in which the invisible text is not added exactly above the corresponding letters in the image, this might result in funny gaps like the one in the OP's screenshot after "multidimensional". In case of errors in the OCR output, one sees the erroneous data like in my screenshots.
Other PDF viewers often merely mark the text by applying some effect to the text area, e.g. inverting colors or overlaying a semi-transparent color.
It might be considered an advantage of the Atril approach that already in the selection process one sees the exact text one is selecting and probably eventually going to copy.
Inside the content stream
As mentioned above, the PDF page actually contains a scanned image upon which invisible text is drawn.
In the page content stream the corresponding instructions look like this:
1 0 0 1 0 0.2401 cm
(shift the coordinate system a minute bit up)
1 1 1 rg
1 i
/RelativeColorimetric ri
/R794 gs
0 0 576 719.5 re
f
(filling the area the image will occupy with white)
q
576 0 0 719.5 0 0 cm
/Im0 Do
Q
(drawing the bitmap image)
1 0 0 1 0 -0.2401 cm
(shift the coordinate system a minute bit down, undoing the initial upshift)
BT
(beginning a text object)
0 0 0 rg
(setting the fill color to black)
/TT1 1 Tf
0.05 Tc
0 Tw
3 Tr
(selecting the font TT1 at size 1, a bit of extra space between characters, no extra space between words, and text rendering mode 3, i.e. invisible)
7.3 0 0 7.3 83.8 678.4401 Tm
(SOFTWARE-PRACTICE ) Tj
(setting the text coordinate system to be shifted by 83.8 horizontally and 678.4401 vertically and to be scaled by 7.3 and drawing some text)
0.08 Tc
7.4 0 0 7.1 175.2 678.4401 Tm
(AND ) Tj
(changing character spacing a bit, setting the text coordinate system to be shifted by 175.2 horizontally and 678.4401 vertically and to be scaled by 7.4 horizontally and 7.1 vertically and drawing some text)
...
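To check whether some other page uses the same trick, one can look for the `3 Tr` operator in its decoded content stream. The following Python sketch does that with a naive regular-expression scan. It assumes the stream has already been decompressed to text by some other tool, and it does not handle nested parentheses inside string literals, so it is an illustration, not a PDF parser. The function name is made up.

```python
import re

def text_render_modes(content_stream):
    """Return the operands of all Tr operators found in a decoded stream.

    Naive approach: drop simple (...) string literals first so that text
    containing 'Tr' is not mistaken for an operator.
    """
    without_strings = re.sub(r"\((?:\\.|[^\\()])*\)", "", content_stream)
    return [int(m) for m in re.findall(r"(?<!\S)(\d)\s+Tr(?!\S)", without_strings)]

# Excerpt of the text object shown above
sample = (
    "BT\n0 0 0 rg\n/TT1 1 Tf\n0.05 Tc\n0 Tw\n3 Tr\n"
    "7.3 0 0 7.3 83.8 678.4401 Tm\n(SOFTWARE-PRACTICE ) Tj\n"
)
modes = text_render_modes(sample)
invisible_text_present = 3 in modes
```

A mode of 3 means the glyphs are neither filled nor stroked, which is how OCR layers are usually hidden on top of the scan.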
TL;DR
What is encoded in the pdf, image or text?
Both, the image plus invisible text upon it.
What is happening to text when I am selecting it?
Atril covers the text in blue and draws the selected (formerly invisible) text upon it in white.

How to generate pdf from a libreoffice calc sheet fitting the page width?

Using LibreOffice 4.1.2.3 in Ubuntu 13.10 I am desperately trying to export the content of a sheet (4 columns) into a pdf (portrait), so all 4 columns fit on a page. A page nicely explains all the settings - but they do not have any effect!
I select all the range I want to export to a pdf (the 4 columns previously mentioned), click File -> Export as PDF, and no matter what I change (e.g. zoom to 7%), the generated pdf contains two pages: One page with the first three columns, and another page with the fourth column.
This is quite cumbersome and ridiculous, and any help is appreciated to solve this problem.
Maybe the LibreOffice Help is misleading here. Those settings (Fit width etc.) only affect how the resulting PDF is displayed. If you want to scale the output to make it fit on a certain number of pages, you will have to modify the page style's properties: Menu Format -> Page... -> Sheet tab.
Here, you have three options:
Reduce / enlarge printout: set a fixed scaling factor (e.g. 50 %);
Fit print range(s) to width / height: set either the maximum width or the maximum height in pages; scaling is proportional in every case;
Fit print range(s) on number of pages: set the maximum page number.
In your case, just select the third option and set the page number to 1:
I had the same settings as in tohuwawohu's answer, yet the page still ended too early, after column EF, regardless of the Scale, Page width or margin settings.
Then I discovered that the Format -> Print Ranges -> Edit menu had a custom range set.
Changing the range to end at the last column solved my problem. HTH somebody.
Go to File > Print Preview, and adjust the content size with the zoom slider. Click Export and you're done.