iTextSharp - Fit formatted Text to single page - pdf

I have a text file that is pre-formatted with spacing and line breaks. I am writing the text out to a blank pdf that has been set to landscape with minimal margins to fit all the text to a single page, however I am still running off the page. Can anyone recommend how I can use itextsharp to dynamically "fit to page" by reducing the font size and/or line-height (lead). I have seen responses about using a textfield or rectangles but I can't seem to get those working properly.
Update: Here is what I have so far that uses no advanced stuff at all, simply margin control and font size adjustments to force my sample text to the page. This works fine if I always have fixed line lengths, but that unfortunately won't be the case. There might be a common max line length I can use across the files but I don't have that data at this time.
private void CreatePDF()
{
string line = string.Empty;
StreamReader sr = new StreamReader(#"C:\dev\text1.txt");
StringBuilder sb = new StringBuilder();
string newFile = #"C:\dev\testPDF1.pdf";
Document pdfDoc = new Document(PageSize.LETTER.Rotate(), 50, 5, 5, 5);
PdfWriter writer = PdfWriter.GetInstance(pdfDoc, new FileStream(newFile, FileMode.OpenOrCreate));
pdfDoc.Open();
while ((line = sr.ReadLine()) != null)
{
if (line != "\f")
{
sb.AppendLine(line);
}
else
{
pdfDoc.Add(new Paragraph(sb.ToString(), new Font(Font.NORMAL, 6)));
pdfDoc.NewPage();
pdfDoc.SetPageSize(PageSize.LETTER.Rotate());
pdfDoc.SetMargins(50, 5, 5, 5);
sb.Clear();
sb.AppendLine("");
}
}
pdfDoc.Add(new Paragraph(sb.ToString(), new Font(Font.NORMAL, 6)));
pdfDoc.Close();
//Console.Write(sb);
}

Related

Arabic characters not showing while creating PDF in Windows Phone (WinRT)

I am creating a PDF using componentOne library to create a PDF in my universal application, this is creating a PDF fine for English, but when I create a PDF in Arabic it starts giving garbage characters instead of Arabic characters.
This looks like an encoding issue to me. Please let me know how do we solve such issues generally for creating PDF, even if you do not know about componentOne library. I might pick the clue.
EDIT:
Download Stripped down code:
http://1drv.ms/1ABAuqi
Code:
async void CreatePdfDocument()
{
try
{
var pdf = new C1PdfDocument(PaperKind.Letter);
pdf.Landscape = false;
// measure and show some text
var text = App.GetResource("DocumentHeading");
var font = new Font("Segoe UI Light", 36, PdfFontStyle.Bold);
var fmt = new StringFormat();
fmt.Alignment = HorizontalAlignment.Center;
// measure it
var sz = pdf.MeasureString(text, font, 72 * 3, fmt);
var rc = new Rect(0, 0, pdf.PageRectangle.Width, sz.Height);
rc = PdfUtils.Offset(rc, 0, 0);
// draw the text
pdf.DrawString(text, font, Colors.Orange, rc, fmt);
}
catch (Exception e)
{
}
}

How to add a rich Textbox (HTML) to a table cell?

I have a rich text box named:”DocumentContent” which I’m going to add its content to pdf using the below code:
iTextSharp.text.Font font = FontFactory.GetFont(#"C:\Windows\Fonts\arial.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, 12f, Font.NORMAL, BaseColor.BLACK);
DocumentContent = System.Web.HttpUtility.HtmlDecode(DocumentContent);
Chunk chunkContent = new Chunk(DocumentContent);
chunkContent.Font = font;
Phrase PhraseContent = new Phrase(chunkContent);
PhraseContent.Font = font;
PdfPTable table = new PdfPTable(2);
table.WidthPercentage = 100;
PdfPCell cell;
cell = new PdfPCell(new Phrase(PhraseContent));
cell.Border = Rectangle.NO_BORDER;
table.AddCell(cell);
The problem is when I open PDF file the content appears as HTML not a text as below:
<p>Overview  line1 </p><p>Overview  line2
</p><p>Overview  line3 </p><p>Overview 
line4</p><p>Overview  line4</p><p>Overview 
line5 </p>
But it should look like below
Overview line1
Overview line2
Overview line3
Overview line4
Overview line4
Overview line5
What I'm going to do is to keep all the styling which user apply to the rich text and just change font family to Arial.
I can change Font Family but I need to Decode this content from HTML to Text.
Could you please advise?
Thanks
Please take a look at the HtmlContentForCell example.
In this example, we have the HTML you mention:
public static final String HTML = "<p>Overview line1</p>"
+ "<p>Overview line2</p><p>Overview line3</p>"
+ "<p>Overview line4</p><p>Overview line4</p>"
+ "<p>Overview line5 </p>";
We also create a font for the <p> tag:
public static final String CSS = "p { font-family: Cardo; }";
In your case, you may want to replace Cardo with Arial.
Note that we registered the regular version of the Cardo font:
FontFactory.register("resources/fonts/Cardo-Regular.ttf");
If you need bold, italic and bold-italic, you also need to register those fonts of the same Cardo family. (In case of arial, you'd register arial.ttf, arialbd.ttf, ariali.ttf and arialbi.ttf).
Now we can parse this HTML and CSS into a list of Element objects with the parseToElementList() method. We can use these objects inside a cell:
PdfPTable table = new PdfPTable(2);
table.addCell("Some rich text:");
PdfPCell cell = new PdfPCell();
for (Element e : XMLWorkerHelper.parseToElementList(HTML, CSS)) {
cell.addElement(e);
}
table.addCell(cell);
document.add(table);
See html_in_cell.pdf for the resulting PDF.
I do not have the time/skills to provide this example in iTextSharp, but it should be very easy to port this to C#.
Finally I write this code in c# which is working perfectly, Thanks to Bruno who helped me to understand XMLWorker.
Here is an example using XMLWorker in C#.
I used a sample HTML as below:
public static string HTML = "<p>Overview line1âââŵẅẃŷûâàêÿýỳîïíìôöóòêëéèẁẃẅŵùúúüûàáäâ</p>"
+ "<p>Overview line2</p><p>Overview line3</p>"
+ "<p>Overview line4</p><p>Overview line4</p>"
+ "<p>Overview line5 </p>";
I have created Test.css file and saved it in SharePoint Style Library. (for this test I saved it in D drive to keep it simple)
Here is the content of my test css file:
p { font-family: arial; }
Then using the below c# code I saved the PDF file in D drive. ( In SharePoint I used Memorystream. I keep this example very simple to understand )
string fileName = #"D:\Test.pdf";
var css = #"D:\Test.css";
using (var ActionStream = new MemoryStream(UTF8Encoding.UTF8.GetBytes(HTML)))
{
using (FileStream cssFile = new FileStream(css, FileMode.Open))
{
var document = new Document(PageSize.A4, 30, 30, 10, 10);
var worker = XMLWorkerHelper.GetInstance();
var writer = PdfWriter.GetInstance(document, new FileStream(fileName, FileMode.Create));
document.Open();
worker.ParseXHtml(writer, document, ActionStream, cssFile);
writer.CloseStream = false;
document.Close();
}
}
It creates Test.pdf file adding my HTML with Font Family:Arial. So all of the Welsh Characters can be saved in PDF file.
Note: I have added iTextSharp.dll v:5.5.3 and XMLworker.dll v: 5.5.3 to my project.
using iTextSharp.text;
using iTextSharp.text.html;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.css;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline;
Hope this can be useful.
Kate

ITextSharp adding text. Some text not showing up

I am adding text to an already created pdf document using this method.
ITextSharp insert text to an existing pdf
Basically it uses the PdfContentByte and then adds the content template to the page.
I am finding that in some areas of the file, the text doesn't show up.
It seems that the text I am adding is showing up behind the content that is already on the page? I flattened the pdf document down to it just being images but I am still having the same issue happen with the flattened file.
Has anyone had any issues adding text being hidden using Itextsharp?
I also tried using DirectContentUnder as was suggested in this link to no avail..
iTextSharp hides text when write
Here is the code I am using...With this I am trying to basically overlay graph paper on top of the PDF. In this example, there is a box in the upper left corner of every page that doesn't get populated. There is an image in the original pdf in this spot. And on the 4th and 5th pages, there are boxes that don't get populated, but they don't seem to be images.
PdfReader reader = new PdfReader(oldFile);
iTextSharp.text.Rectangle size = reader.GetPageSizeWithRotation(1);
Document document = new Document(size);
// open the writer
FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write);
PdfWriter writer = PdfWriter.GetInstance(document, fs);
document.Open();
// the pdf content
PdfContentByte cb = writer.DirectContent;
for (int i = 0; i < reader.NumberOfPages; i++)
{
document.NewPage();
// select the font properties
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
cb.SetFontAndSize(bf, 4);
cb.SetColorStroke(BaseColor.GREEN);
cb.SetLineWidth(1f);
for (int j = 10; j < 600; j += 10)
{
WriteToDoc(ref cb, j.ToString(), j, 10);//Write the line number
WriteToDoc(ref cb, j.ToString(), j, 780);//Write the line number
if (j % 20 == 0)
{
cb.MoveTo(j, 20);
cb.LineTo(j, 760);
cb.Stroke();
}
}
for (int j = 10; j < 800; j += 10)
{
WriteToDoc(ref cb, j.ToString(), 5, j);//Write the line number
WriteToDoc(ref cb, j.ToString(), 590, j);//Write the line number
if (j % 20 == 0)
{
cb.MoveTo(15, j);
cb.LineTo(575, j);
cb.Stroke();
}
}
// create the new page and add it to the pdf
PdfImportedPage page = writer.GetImportedPage(reader, i + 1);
cb.AddTemplate(page, 0, 0);
}
// close the streams and voilá the file should be changed :)
document.Close();
fs.Close();
writer.Close();
reader.Close();
Thanks for any of the help you can provide...I really appreciate it!
-Greg
First of all: If you are trying to basically overlay graph paper on top of the PDF, why do you first draw the graph paper and stamp the original page onto it? You essentially are underlaying graph paper, not overlaying it.
Depending on the content of the page, your graph paper this way may easily get covered. E.g. if there is a filled rectangle in the page content, in the result there is a box in the upper left corner of every page that doesn't get populated.
Thus, simply first add the old page content, then add overlay changes.
This being said, for the task of applying changes to an existing PDF, using PdfWriter and GetImportedPage is less than optimal. This actually is a task for the PdfStamper class which its made for stamping additional content on existing PDFs.
E.g. have a look at the sample StampText, the pivotal code being:
PdfReader reader = new PdfReader(resource);
using (var ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, ms))
{
PdfContentByte canvas = stamper.GetOverContent(1);
ColumnText.ShowTextAligned( canvas, Element.ALIGN_LEFT, new Phrase("Hello people!"), 36, 540, 0 );
}
return ms.ToArray();
}

Insert PDF in PDF (NOT merging files)

I'd like to insert a PDF page in another PDF page scaled. I'd like to use iTextSharp for this.
I have a vector drawing which can be exported as a single page PDF file. I would like to add this file into a page of other PDF document just like I would add an image to a PDF document.
Is this possible?
The purpose of this is to retain the ability to zoom in without losing quality.
It is very hard to reproduce the vector drawing using PDF vectors because it is an extremely complex drawing.
Exporting the vector drawing as high resolution image is not an option since I have to use a lot of them in a single PDF document. The final PDF would be very large and its writing too slow.
This is relatively easy to do although there's a couple of ways to go about it. If you're creating a new document that has the other documents inside of it and nothing else then the easiest thing to use is probably the PdfWriter.GetImportedPage(PdfReader, Int). This will give you a PdfImportedPage (which inherits from PdfTemplate). Once you have that you can add it to your new document by using PdfWriter.DirectContent.AddTemplate(PdfImportedPage, Matrix).
There's a couple of overloads to AddTemplate() but the easiest one (at least for me) is the one that takes a System.Drawing.Drawing2D.Matrix. If you use this you can easily scale and translate (change x,y) without having to think in "matrix" terms.
Below is sample code that shows this off. It targets iTextSharp 5.4.0 although it should work pretty much the same with 4.1.6 if you remove the using statements. It first creates a sample PDF with 12 pages with random background colors. Then it creates a second document and adds each page from the first PDF scaled by 50% so that 4 old pages fit onto 1 new page. See the code comments for further details. This code assumes that all pages are the same size, you might need to perform further calculations if your situation differs.
//Test files that we'll be creating
var file1 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File1.pdf");
var file2 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File2.pdf");
//For test purposes we'll fill the pages with a random background color
var R = new Random();
//Standard PDF creation, nothing special here
using (var fs = new FileStream(file1, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Create 12 pages with text on each one
for (int i = 1; i <= 12; i++) {
doc.NewPage();
//For test purposes fill the page with a random background color
var cb = writer.DirectContentUnder;
cb.SaveState();
cb.SetColorFill(new BaseColor(R.Next(0, 256), R.Next(0, 256), R.Next(0, 256)));
cb.Rectangle(0, 0, doc.PageSize.Width, doc.PageSize.Height);
cb.Fill();
cb.RestoreState();
//Add some text to the page
doc.Add(new Paragraph("This is page " + i.ToString()));
}
doc.Close();
}
}
}
//Create our combined file
using (var fs = new FileStream(file2, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
//Bind a reader to the file that we created above
using (var reader = new PdfReader(file1)) {
doc.Open();
//Get the number of pages in the original file
int pageCount = reader.NumberOfPages;
//Loop through each page
for (int i = 0; i < pageCount; i++) {
//We're putting four original pages on one new page so add a new page every four pages
if (i % 4 == 0) {
doc.NewPage();
}
//Get a page from the reader (remember that PdfReader pages are one-based)
var imp = writer.GetImportedPage(reader, (i + 1));
//A transform matrix is an easier way of dealing with changing dimension and coordinates on an rectangle
var tm = new System.Drawing.Drawing2D.Matrix();
//Scale the image by half
tm.Scale(0.5f, 0.5f);
//PDF coordinates put 0,0 in the bottom left corner.
if (i % 4 == 0) {
tm.Translate(0, doc.PageSize.Height); //The first item on the page needs to be moved up "one square"
} else if (i % 4 == 1) {
tm.Translate(doc.PageSize.Width, doc.PageSize.Height); //The second needs to be moved up and over
} else if (i % 4 == 2) {
//Nothing needs to be done for the third
} else if (i % 4 == 3) {
tm.Translate(doc.PageSize.Width, 0); //The fourth needs to be moved over
}
//Add our imported page using the matrix that we set above
writer.DirectContent.AddTemplate(imp,tm);
}
doc.Close();
}
}
}
}
In addition; while i was trying to add a rotated pdf to a rotated pdf, i got some rotation problems. Kind of confusing but you should check the "PdfImportedPage.Rotation" of the page which is gonna be added to pdf.
PdfImportedPage page;//page = writer.GetImportedPage(PdfReader reader, int pageNum);
PdfContentByte pcb;//pcb = PdfWriter.DirectContentUnder;
//create matrix to use for rotating imported page
Matrix matrix = new Matrix(a, b, c, d, e, f);
matrix.Rotate(-(page.Rotation));
if (page.Rotation != 0)
pcb.AddTemplate(page, matrix, true);
else
pcb.AddTemplate(page, a, b, c, d, e, f, true);
code looks like silly but i want to get your attention on "matrix.Rotate(negative rotation of imported page)"

How to exactly position an Image inside an existing PDF page using PDFBox?

I am able to insert an Image inside an existing pdf document, but the problem is,
The image is placed at the bottom of the page
The page becomes white with the newly added text showing on it.
I am using following code.
List<PDPage> pages = pdDoc.getDocumentCatalog().getAllPages();
if(pages.size() > 0){
PDJpeg img = new PDJpeg(pdDoc, in);
PDPageContentStream stream = new PDPageContentStream(pdDoc,pages.get(0));
stream.drawImage(img, 60, 60);
stream.close();
}
I want the image on the first page.
PDFBox is a low-level library to work with PDF files. You are responsible for more high-level features. So in this example, you are placing your image at (60, 60) starting from lower-left corner of your document. That is what stream.drawImage(img, 60, 60); does.
If you want to move your image somewhere else, you have to calculate and provide the wanted location (perhaps from dimensions obtained with page.findCropBox(), or manually input your location).
As for the text, PDF document elements are absolutely positioned. There are no low-level capabilities for re-flowing text, floating or something similar. If you write your text on top of your image, it will be written on top of your image.
Finally, for your page becoming white -- you are creating a new content stream and so overwriting the original one for your page. You should be appending to the already available stream.
The relevant line is:
PDPageContentStream stream = new PDPageContentStream( pdDoc, pages.get(0));
What you should do is call it like this:
PDPageContentStream stream = new PDPageContentStream( pdDoc, pages.get(0), true, true);
The first true is whether to append content, and the final true (not critical here) is whether to compress the stream.
Take a look at AddImageToPDF sample available from PDFBox sources.
Try this
doc = PDDocument.load( inputFileName );
PDXObjectImage ximage = null;
ximage = new PDJpeg(doc, new FileInputStream( image )
PDPage page = (PDPage)doc.getDocumentCatalog().getAllPages().get(0);
PDPageContentStream contentStream = new PDPageContentStream(doc, page, true, true);
contentStream.drawImage( ximage, 425, 675 );
contentStream.close();
This prints the image in first page. If u want to print in all pages just put on a for loop with a condition of number of pages as the limit.
This worked for me well!
So late answer but this is for who works on it in 2020 with Kotlin: drawImage() is getting float values inside itself so try this:
val file = File(getPdfFile(FILE_NAME))
val document = PDDocument.load(file)
val page = document.getPage(0)
val contentStream: PDPageContentStream
contentStream = PDPageContentStream(document, page, true, true)
// Define a content stream for adding to the PDF
val bitmap: Bitmap? = ImageSaver(this).setFileName("sign.png").setDirectoryName("signature").load()
val mediaBox: PDRectangle = page.mediaBox
val ximage: PDImageXObject = JPEGFactory.createFromImage(document, bitmap)
contentStream.drawImage(ximage, mediaBox.width - 4 * 65, 26f)
// Make sure that the content stream is closed:
contentStream.close()
// Save the final pdf document to a file
pdfSaveLocation = "$directoryPDF/$UPDATED_FILE_NAME"
val pathSave = pdfSaveLocation
document.save(pathSave)
document.close()
I am creating a new PDF and running below code in a loop - to add one image per page and below co-ordinates and height and width values work well for me.
where out is BufferedImage reference variable
PDPage page = new PDPage();
outputdocument.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(outputdocument, page, AppendMode.APPEND, true);
PDImageXObject pdImageXObject = JPEGFactory.createFromImage(outputdocument, out);
contentStream.drawImage(pdImageXObject, 5, 2, 600, 750);
contentStream.close();
This link gives you details about Class PrintImageLocations.
This PrintImageLocations will give you the x and y coordinates of the images.
Usage: java org.apache.pdfbox.examples.util.PrintImageLocations input-pdf