Using the whole space of a PDF file

I am using Prawn to create a PDF file, but it always leaves margins around the page. Is there a way to use the whole page without leaving any margins?
Thanks !!!

Are you referring to the page bounds?
The space that is available on the page can be shown with this example code:
require 'prawn/core'
require 'prawn/layout'

Prawn::Document.generate('padded_box.pdf') do
  stroke_bounds
  text "Margin box"
  padded_box(25) do
    stroke_bounds
    text "Bounding box padded by 25 on all sides from the margins"
    padded_box(50) do
      stroke_bounds
      text "Bounding box padded by 50 on all sides from the parent bounds"
    end
  end
end
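This draws the bounds of the page, showing the margin. The gap between the page edge and the outermost box is the page margin, which is typically reserved for the printer's non-printable area. Prawn accepts a :margin option when the document is created, for example Prawn::Document.generate('full_page.pdf', :margin => 0), which should remove the gap.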

iText7 - create PDF with exact dimensions when printed - how?

I'm creating a simple PDF using iText7 (C#) but I need it to be printed at exactly the right size. Here's my code:
PdfWriter writer = new PdfWriter("output.pdf");
PdfDocument pdf = new PdfDocument(writer);
pdf.SetDefaultPageSize(iText.Kernel.Geom.PageSize.LETTER);
var page = pdf.AddNewPage();
page.SetCropBox(new iText.Kernel.Geom.Rectangle(36, 36, 7.5f * 72, 10 * 72));
PdfCanvas canvas = new PdfCanvas(page);
canvas.SetStrokeColor(ColorConstants.BLACK).SetLineWidth(3);
canvas.MoveTo(36, 36);
canvas.LineTo(36, 36 + 72); // Draw a line 1 inch long
canvas.LineTo(36 + 72, 36 + 72); // Draw a second line, perpendicular to the first, also 1 inch long
canvas.ClosePathStroke();
pdf.Close();
If I right-click the resulting PDF and select "Print", my triangle is off the bottom of the page.
When I open the resulting PDF in the PDF program I'm using (PDF Architect), it gives me a few options:
If I just click "Print", it gives me lines that are 1 1/16" long and start about 1/8" from the edge of the page, so by default PDF Architect seems to be taking the contents of my crop box and expanding it to the maximum page availability.
If I click on "Fit" before clicking "Print", that results in the desired output - lines 1" long, starting 1/2" from each side of the page. That works but is error-prone - too easy to forget to click "Fit" every time.
Is there a way to generate a PDF that contains information that says "I'm targeting this document at letter size, but I'm staying 1/2" away from all the edges, so when you print, if the printer has margins <= 1/2 inch you should be fine, and just print it exactly how I've described without any shrinking or enlarging"?
You will not be able to completely control this from the PDF document. The PDF processor (e.g. viewing application) or the printer (driver) will always be able to scale the content up or down.
Apparently, PDF Architect has the "Fit" option enabled by default, so it scales the page to the selected paper size.
You are setting a crop box of 7.5x10 in. I assume you're printing to Letter sized (8.5x11 in) paper. So the 7.5x10 page will indeed be scaled up, and your content will become slightly larger.
Regarding this part of the question:
Is there a way to generate a PDF that contains information that says "I'm targeting this document at letter size, but I'm staying 1/2" away from all the edges, so when you print, if the printer has margins <= 1/2 inch you should be fine, and just print it exactly how I've described without any shrinking or enlarging"?
I would not set the crop box. When the pages in the PDF document are Letter size and the output paper is also Letter size, it should not matter whether the "Fit" option is enabled or not, as no scaling needs to happen. It's definitely not a foolproof solution, but at least it's less error-prone.
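The scaling described above can be checked with a little arithmetic. This is a minimal sketch in Python, not iText code. `fit_scale` is a hypothetical helper, and the sketch assumes that "fit" means scaling the visible box uniformly to the largest size that fits the target area.

```python
POINTS_PER_INCH = 72  # default PDF user space: 72 units per inch


def fit_scale(box_w, box_h, target_w, target_h):
    """Uniform scale factor that makes a box as large as possible inside a target area."""
    return min(target_w / box_w, target_h / box_h)


# Crop box from the question (7.5 x 10 in) printed on Letter paper (8.5 x 11 in).
crop_w, crop_h = 7.5 * POINTS_PER_INCH, 10 * POINTS_PER_INCH
letter_w, letter_h = 8.5 * POINTS_PER_INCH, 11 * POINTS_PER_INCH

scale_cropped = fit_scale(crop_w, crop_h, letter_w, letter_h)
scale_full = fit_scale(letter_w, letter_h, letter_w, letter_h)

print(f"crop box scaled by {scale_cropped:.3f}")
print(f"full Letter page scaled by {scale_full:.3f}")
```

With the whole sheet as the target, the cropped page is enlarged by a factor of 1.1, so a 1 in line would print at up to 1.1 in. A printer's own non-printable margins reduce the target area and therefore the factor, which may explain the 1 1/16 in measured in the question. Without a crop box the factor is 1 and nothing is resized.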

Word Print Preview not honoring my SSRS format settings

<InteractiveHeight>11in</InteractiveHeight>
<InteractiveWidth>8.5in</InteractiveWidth>
<LeftMargin>1in</LeftMargin>
<RightMargin>1in</RightMargin>
<TopMargin>1in</TopMargin>
<BottomMargin>1in</BottomMargin>
When I open Print Preview after exporting the report to Word, it shows:
Custom Page Size 10.36" x 11"
My margins are fine, but the wording goes off the page and gets cut off.
Is there any way to force the page size to be 8.5x11?
Check whether some of the elements are wider than the maximum width. Usually when I have this issue, it turns out that a page header, footer, or tablix is a few millimeters wider than it should be. Also, check that the content AND the margins together fit within the dimensions of the Word page you want.
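The check described above comes down to simple arithmetic. The following Python sketch uses the numbers from the question. The helper names are hypothetical, and it assumes that the page width Word reports is the content width plus the left and right margins.

```python
def max_body_width(page_width, left_margin, right_margin):
    """Widest content (in inches) that still fits between the margins."""
    return page_width - left_margin - right_margin


def implied_content_width(reported_page_width, left_margin, right_margin):
    """Content width implied by the page size Word reports (assumption: page = content + margins)."""
    return reported_page_width - left_margin - right_margin


allowed = max_body_width(8.5, 1.0, 1.0)
actual = implied_content_width(10.36, 1.0, 1.0)
overflow = actual - allowed

print(f"allowed content width: {allowed:.2f} in")
print(f"implied content width: {actual:.2f} in")
print(f"content is too wide by: {overflow:.2f} in")
```

Under that assumption, some element in the report would be about 1.86 in wider than the 6.5 in that an 8.5 in page with 1 in margins allows.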

Lost some text when extracting pdf

I've tried to get all the text on the page using iText, but I have no idea why every coordinate text loses its last two characters.
PdfDocument pdfDoc = new PdfDocument(new PdfReader(@"E:\Coding\COOR.pdf"));
LocationTextExtractionStrategy strategy = new LocationTextExtractionStrategy();
PdfCanvasProcessor parser = new PdfCanvasProcessor(strategy);
parser.ProcessPageContent(pdfDoc.GetFirstPage());
Console.Write(strategy.GetResultantText());
pdfDoc.Close();
Console.WriteLine("Great!");
Console.ReadKey();
You can also download my code from
https://1drv.ms/u/s!Al1hUSZtR4OjwU3XVBRQGneVaZlS
In short
The reason for that "lost text" is that the missing "text" isn't there to start with!
In detail
The contents of your PDF file are constructed in a misleading manner.
On the one hand there are very many path definitions which then are stroked (drawn). These drawings create what you can see in a viewer, both text and table lines.
On the other hand there are a few text drawing instructions to draw text using text rendering mode 3 which is... invisible! These drawings create the text you can copy&paste in a viewer or extract using iText.
Unfortunately the text in the text drawing instructions and the text drawn using paths does not match completely. The text you retrieve via copy&paste or text extraction, therefore, differs from your expectations.
Also, the glyph sizes and positions are not exactly the same.
To illustrate this I made the text drawing instructions use the normal (fill) text rendering mode. (The original post showed before and after screenshots of the top left corner of the page here.)
As you see the formerly invisible text is only approximately at the same position as the visible drawings, and it is somewhat broken: The symbol for degrees is weirdly represented as "¡ã", and the longitude fractional seconds and the following symbol for seconds are missing.
To correctly extract the originally visible data, you'll need to use OCR instead of text extraction.

Drawing a second text below the first text

I would like to draw 2 texts onto my PDF.
The first text should be aligned to the top left corner.
This works fine.
I'm using:
canvas = stamper.GetOverContent(i)
watermarkFont = iTextSharp.text.pdf.BaseFont.CreateFont(iTextSharp.text.pdf.BaseFont.HELVETICA, iTextSharp.text.pdf.BaseFont.CP1252, iTextSharp.text.pdf.BaseFont.NOT_EMBEDDED)
watermarkFontColor = iTextSharp.text.BaseColor.RED
canvas.MoveTo(0, 0) 'I think the canvas is the space that we draw onto. My documents always start at position X=0 and Y=0, so move to 0,0 should be fine
canvas.BeginText()
canvas.SetFontAndSize(watermarkFont, 12)
canvas.SetColorFill(watermarkFontColor)
canvas.ShowTextAligned(Element.ALIGN_TOP, uText, 0, 830, 0) 'is 830 the width of the available space?
canvas.EndText()
Now I would like to draw another text approximately 100 pixels below the first text.
I'm using:
canvas.MoveTo(0, 100) 'let's draw the second text at X=100, Y=100
canvas.BeginText()
canvas.SetFontAndSize(watermarkFont, 12)
canvas.SetColorFill(watermarkFontColor)
canvas.ShowTextAligned(Element.ALIGN_CENTER, uBewirtung, 0, 830, 0)
canvas.EndText()
The second text however doesn't show up at all.
I suspect I'm drawing outside the document, but I don't see my mistake.
The MoveTo() method is meant for drawing paths (lines and shapes in graphics state), not text (in text state). It adds an m operator to the content stream. If you are a PDF specialist, you should use the SetTextMatrix() method inside your BT/ET text block; see "What does setTextMatrix of contentByte class in iText do?"
Note the if; it is important. If you are not a PDF specialist, you shouldn't be toying around with those methods. You should use ColumnText.ShowTextAligned() instead of BeginText(), EndText() and all of the lines you added in-between. Those methods are meant for people who speak PDF syntax.
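A note on the coordinates passed to ShowTextAligned in the question: the fourth and fifth arguments are x and y, measured in points from the bottom left corner of the page in the default PDF coordinate system, with y growing upward. "Below" therefore means a smaller y value. The Python sketch below only illustrates that arithmetic. `y_from_top` is a hypothetical helper, and an A4 page height of 842 points is an assumption that the question does not confirm.

```python
A4_HEIGHT_PT = 842  # assumed page height; A4 is 595 x 842 points


def y_from_top(distance_from_top, page_height=A4_HEIGHT_PT):
    """Convert a distance measured down from the top edge into a PDF y coordinate."""
    return page_height - distance_from_top


first_y = y_from_top(12)   # first text, 12 points below the top edge
second_y = first_y - 100   # second text, 100 points further down

print(first_y, second_y)
```

In the question, both calls pass 830 as the y value, so the second text is not placed lower than the first one.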

PostScript code to un-hide hidden text in PDF

I have a PDF with some hidden text in it.
When I press [CTRL+a] I see the hidden text in my document viewer.
I can copy the text too and I can extract the text via pdftotext, but I can't recolorize the text so I can view the hidden text in the PDF viewer without pressing [CTRL+a].
So I had the idea that I could use PostScript to change the color of this text object.
But how can I determine what function sets the color or hides the text?
You cannot use PostScript to achieve what you want. You need to resort to manually editing the PDF file...
There are basically three ways to "hide" text:
1. It could be white (or any color) text on white (or same color as text) background.
2. It could be covered by another object, say, a white area, or an image.
3. It could be using Text Rendering Mode 3 ("3 Tr").
The first two cases I'll not explain here, because they are rather unlikely. For the third case you could proceed like this:
1. Use qpdf to unpack as many of the compressed 'streams' inside the PDF as possible, creating what qpdf calls the 'QDF mode' of a PDF:
qpdf --qdf --object-streams=disable input.pdf uncompressed.pdf
2. Open uncompressed.pdf in a good text editor, such as Vim.
3. Search for the sequence 3 Tr.
(Text rendering mode 3 is described in the PDF-1.7 specification as "Neither fill nor stroke text (invisible).")
4. Change it to 1 Tr or 2 Tr and save the file.
(Text rendering mode 1 is "stroke text", mode 2 is "Fill, then stroke text." Mode 1 will only show the outlines. Mode 0, "Fill text", is the normal mode and works as well.)
5. Re-compress the file:
qpdf uncompressed.pdf input-modified.pdf
6. Open the new file input-modified.pdf in your favourite PDF viewer. It should now show the "un-hidden" text.
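The search and replace step above can also be scripted. The following is a minimal Python sketch, not a qpdf feature. `unhide_text` is a hypothetical helper that works on the raw bytes of the QDF file, and it assumes the operator is written exactly as "3 Tr". Because it is a blind byte replacement, it would also change matching bytes inside string literals or binary streams, so check the result by hand.

```python
import re

# Match "3 Tr" as a standalone operand/operator pair, not "13 Tr", "0.3 Tr" or "3 Trx".
INVISIBLE_MODE = re.compile(rb"(?<![\w.])3 Tr(?!\w)")


def unhide_text(qdf_bytes, new_mode=2):
    """Replace every '3 Tr' with '<new_mode> Tr'.

    new_mode must be a single digit (0, 1 or 2) so the byte length of the
    file, and with it every offset stored in the file, stays unchanged.
    """
    if new_mode not in (0, 1, 2):
        raise ValueError("new_mode must be 0, 1 or 2")
    return INVISIBLE_MODE.sub(b"%d Tr" % new_mode, qdf_bytes)


# Tiny stand-in for a content stream from an uncompressed QDF file.
sample = b"BT\n/F1 12 Tf\n3 Tr\n(hidden) Tj\nET\n"
print(unhide_text(sample).decode("latin-1"))
```

For a real file, read uncompressed.pdf in binary mode, pass the bytes through the function, write them back, and then run the re-compress command from the steps above.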
Update
Having received a sample of a PDF file with "hidden" text from the OP (via private channels), I can confirm now that the hiding indeed is achieved by using white text color (RGB-white).
To make such text visible:
Unpack the PDF, using qpdf --qdf --object-streams=disable in.pdf unpacked.pdf
Search for all occurrences of 1 1 1 rg and 1 1 1 RG. These set the RGB colors to white (the first one non-stroking, the second one for stroking operations).
Comments à la %%Contents for page N: in the QDF-version of the uncompressed PDF file will indicate for which page the color setting is valid. (Note, there may be multiple occurrences of the rg and RG operators, each one setting a different (or the same) color for the next drawing operation.)
Now replace the white colors by black ones, by overwriting the found occurrences with 0 0 0 rg and 0 0 0 RG. Do this not all at once, but one after the other and observe what changes on the respective page after saving the changes. (You may want to avoid painting white text to black if it is on a black background already!)
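The one-at-a-time replacement described above can be sketched in Python as follows. The helper names `find_white` and `blacken_at` are hypothetical, and the sketch assumes the operands are written exactly as "1 1 1"; other spellings of white, such as "1.0 1.0 1.0 rg", are not detected.

```python
import re

# '1 1 1 rg' (non-stroking) or '1 1 1 RG' (stroking) as standalone tokens.
WHITE_RGB = re.compile(rb"(?<![\w.])1 1 1 (rg|RG)(?!\w)")


def find_white(qdf_bytes):
    """Return (byte offset, operator) for every white RGB colour setting."""
    return [(m.start(), m.group(1).decode("ascii"))
            for m in WHITE_RGB.finditer(qdf_bytes)]


def blacken_at(qdf_bytes, offset):
    """Overwrite the operands '1 1 1' at offset with '0 0 0' (same length)."""
    if not WHITE_RGB.match(qdf_bytes, offset):
        raise ValueError("no white rg/RG operator at this offset")
    return qdf_bytes[:offset] + b"0 0 0" + qdf_bytes[offset + 5:]


# Tiny stand-in for part of an unpacked content stream.
sample = b"1 1 1 rg\n(one) Tj\n0 0 0 RG\n1 1 1 RG\n"
hits = find_white(sample)
print(hits)
print(blacken_at(sample, hits[0][0]).decode("latin-1"))
```

Changing a single offset per run and reopening the file in a viewer follows the advice above to observe each change before making the next one.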
Firstly, hidden text in PDF is done with a text rendering mode, not a colour. Text rendering mode 3 is 'neither stroke nor fill'. So changing the colour won't help you if this is how the text is drawn. Of course we can't tell if this is how the text has been drawn (but I suspect it is) because you haven't made the PDF file publicly available. In almost all cases if you want to discuss a particular file the best thing to do is make it public.
Secondly, you can't use PostScript to change a PDF file (well, you could write a PostScript program to interpret the PDF file, but that would be hard...)