Adding an imported PDF to a table cell in iTextSharp - pdf

I am creating a new PDF that will contain a compilation of other documents.
These other documents can be Word/Excel files, images, or PDFs.
I am hoping to add all of this content to cells in a table, which is then added to the document. That gives me automatic page breaks, lets me position elements in a cell rather than on a page, and makes it easier to keep the content in the order I supply it (such as img, doc, pdf, img, pdf, etc.).
Adding images to the table is simple enough.
I am converting the Word/Excel docs to PDF image streams. I'm also reading in the existing PDFs as streams.
Adding these to a new PDF is simple enough, by adding a template to the PdfContentByte.
What I am trying to do, though, is add these PDFs to cells in a table, which is then added to the doc.
Is this possible?

Please download chapter 6 of my book. It contains two variations on what you are trying to do:
ImportingPages1, which produces time_table_imported1.pdf
ImportingPages2, which produces time_table_imported2.pdf
This is a code snippet:
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(RESULT));
// step 3
document.open();
// step 4
PdfReader reader = new PdfReader(MovieTemplates.RESULT);
int n = reader.getNumberOfPages();
PdfImportedPage page;
PdfPTable table = new PdfPTable(2);
for (int i = 1; i <= n; i++) {
    page = writer.getImportedPage(reader, i);
    table.getDefaultCell().setRotation(-page.getRotation());
    table.addCell(Image.getInstance(page));
}
document.add(table);
// step 5
document.close();
reader.close();
The pages are imported as PdfImportedPage objects, and then wrapped inside an Image so that we can add them to a PdfPTable.

Related

huge data export in pdf itextsharp

I am trying to export a large amount of data to PDF. I am not exporting from a GridView that is displayed on the page; instead I create a dummy GridView in code and bind the data to it, without displaying the grid on the page. I tried the code below:
Private Sub ExportGridToPDF()
Using myMemoryStream As New MemoryStream()
Dim myDocument As New iTextSharp.text.Document(iTextSharp.text.PageSize.A1, 10.0F, 10.0F, 10.0F, 0.0F)
' Dim myDocument As New iTextSharp.text.Document()
Dim myPDFWriter As PdfWriter = PdfWriter.GetInstance(myDocument, myMemoryStream)
myDocument.Open()
' Add to content to your PDF here...
Dim sw As New StringWriter()
Dim hw As New HtmlTextWriter(sw)
GridView1.AllowPaging = False
GridView1.DataBind()
GridView1.RenderControl(hw)
' We're done adding stuff to our PDF.
myDocument.Add(hw)
myDocument.Close()
Dim content As Byte() = myMemoryStream.ToArray()
' Write out PDF from memory stream.
Using fs As FileStream = File.Create("eport_PDF.pdf")
fs.Write(content, 0, CInt(content.Length))
End Using
End Using
End Sub
When I run this, it shows an error:
System.InvalidCastException: Unable to cast object of type 'System.Web.UI.HtmlTextWriter' to type 'iTextSharp.text.IElement'.
on this line:
myDocument.Add(hw)
I use a MemoryStream because of the large amount of data. When I ran the code without a MemoryStream, I got an OutOfMemoryException, so I switched to a MemoryStream, and now I get this different error.
The Add() method in the Document object only accepts parameters that implement the IElement interface. You are passing an HtmlTextWriter object. That object is totally unrelated to iText. It is truly amazing that you would think this could work.
In this question, as in previous questions you posted (some of which are deleted), you refer to HTML. You were using HTMLWorker in Add image using itextsharp and the deleted question Out Of Memory Exception error itext sharp.
If you want to convert HTML to PDF, you should upgrade to iText 7 and use the pdfHTML add-on. Take a look at the tutorial to see how HTML to PDF conversion is done: https://developers.itextpdf.com/content/itext-7-converting-html-pdf-pdfhtml
In a comment to this answer however, you write: I'm not exporting data in HTML to PDF. OK, if that's true, then why do you refer to HTML in your code? That's very confusing.
Furthermore, you write I create dummy grid-view in code and bind data in it. Unfortunately, you don't give us any information about the format of that dummy grid-view. I assume, it's something you "invented" yourself, but if that's the case, how do you suppose that iText can magically understand the dummy grid-view you invented?
I started this answer by saying that the Add() method only accepts objects that implement the IElement interface. Since you are talking about a grid, it's probably interesting to use an iText table element. In iText 5, there's an object named PdfPTable; in iText 7, that object is simply named Table.
Many people with large data sets create such a table object first, then add it to a Document. That's not always wise, because objects keep building up in memory, eventually resulting in an OutOfMemoryException. For large data sets, you should mark the table as a large element, and add the table gradually.
In iText 5, the code would look like this:
Document document = new Document();
FileStream stream = new FileStream(fileName, FileMode.Create);
var pdfWriter = PdfWriter.GetInstance(document, stream);
document.Open();
PdfPTable table = new PdfPTable(4);
table.Complete = false;
for (int i = 0; i < 1000000; i++) {
PdfPCell cell = new PdfPCell(new Phrase(i.ToString()));
table.AddCell(cell);
if (i > 0 && i % 1000 == 0) {
document.Add(table);
}
}
table.Complete = true;
document.Add(table);
document.Close();
We're adding 1000000 cells to a table with 4 columns, but we add the table every 1000 cells (so every 250 rows). This means that the content is flushed from memory on a regular basis, thus avoiding an OutOfMemoryException.
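The flush cadence of that loop can be checked without iText. What follows is a minimal, library-free sketch in Java, not the library's own code: BatchModel, Stats and simulate are hypothetical names, and a plain counter stands in for the cells a real table would be holding between two document.Add(table) calls.

```java
/** Library-free model of the batching loop; no iText calls are made. */
final class BatchModel {

    /** Outcome of one simulated run. */
    static final class Stats {
        final int flushes;      // number of in-loop flushes
        final int maxBuffered;  // largest number of cells held at once
        final int leftover;     // cells still buffered when the loop ends

        Stats(int flushes, int maxBuffered, int leftover) {
            this.flushes = flushes;
            this.maxBuffered = maxBuffered;
            this.leftover = leftover;
        }
    }

    static Stats simulate(int totalCells, int interval) {
        int buffered = 0;
        int flushes = 0;
        int maxBuffered = 0;
        for (int i = 0; i < totalCells; i++) {
            buffered++;                        // stands in for table.AddCell(cell)
            if (i > 0 && i % interval == 0) {  // same condition as the iText 5 loop
                maxBuffered = Math.max(maxBuffered, buffered);
                buffered = 0;                  // stands in for document.Add(table)
                flushes++;
            }
        }
        maxBuffered = Math.max(maxBuffered, buffered);
        return new Stats(flushes, maxBuffered, buffered);
    }

    public static void main(String[] args) {
        Stats s = simulate(1_000_000, 1000);
        System.out.println(s.flushes + " in-loop flushes, at most " + s.maxBuffered
                + " cells buffered, " + s.leftover + " cells left for the final add");
    }
}
```

In this model the first batch holds 1001 cells, because the flush condition first fires at i = 1000, and no later batch is larger. That is the point of marking the table as incomplete: the number of cells held at any moment stays near the flush interval instead of growing to the full 1000000.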
Since you seem to be new at iText, do yourself a favor, and upgrade to using iText 7. iText 5 is in maintenance mode, which means that no new functionality will be added to that version. For instance: if at some point someone asks you to produce PDF 2.0 files (the PDF 2.0 spec was released a couple of months ago), you will have to throw all your iText 5 code away, and start anew, because only iText 7 will support PDF 2.0.
The large table functionality in iText 7, is discussed at the end of chapter 5 of the tutorial:
PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
Document document = new Document(pdf);
Table table = new Table(new[] {1f, 1f, 1f}, true);
table.AddHeaderCell("Table header 1");
table.AddHeaderCell("Table header 2");
table.AddHeaderCell("Table header 3");
table.AddFooterCell("Table footer 1");
table.AddFooterCell("Table footer 2");
table.AddFooterCell("Table footer 3");
document.Add(table);
for (int i = 0; i < 1000; i++)
{
table.AddCell($"Row {i + 1}; column 1");
table.AddCell($"Row {i + 1}; column 2");
table.AddCell($"Row {i + 1}; column 3");
if (i % 50 == 0)
{
table.Flush();
}
}
table.Complete();
document.Close();
As you can see, the iText 7 code is much more intuitive. We create a table with 3 columns, and the second parameter (true) indicates that we will add a very large table. We add a header, we add a footer, and we add the table to the document. Then we add 1000 rows, but we Flush() the table every 50 rows. Flushing frees memory, which avoids running out of memory. Once we're done, we Complete() the table.
All of this is documented on the official web site! There is no need for you to invent your own grid view. As you have found out, inventing your own grid view cannot possibly work.
Also important: you say iTextSharp, I say iText. We both mean the same thing: the PDF library produced by iText Group that can be used to create PDF documents from C# code. Only you are using the old name, whereas we try to avoid that name based on the advice of a Trademark who told us that there's a company named Sharp that doesn't appreciate other companies using the word Sharp in the context of brands that aren't related to their company. So please stop saying that you're using iTextSharp; you're using iText!

In iTextSharp, how to include an existing PDF while creating a new document

I've found many solutions here and in the 'iText in Action' book for merging PDFs using the PdfCopy and PdfSmartCopy classes. The only similar question I've seen was one where the asker worked it out himself but didn't post the answer. The post "Add an existing PDF from file to an unwritten document using iTextSharp" asks the same question, but there the PDF goes at the end, so the answers suggest closing the existing document and then using PdfCopy. Here I'd like to insert it anywhere. So here goes.
I'm creating an iTextSharp document with text and images using the normal Section, Phrase, Document and PdfWriter classes. This is code written over many years and it works fine. Now we need to insert an existing PDF while creating this document, either as a new Section, or as a Chapter if that isn't possible. I have the PDF as a Byte array, so there is no problem getting a PdfReader. However, I cannot work out how to read that PDF and insert it into the existing document at the point I'm at. I can get access to the PdfWriter if need be, but for the rest of the document all access is via Sections. This is as far as I've got, and I can add the PdfWriter as another parameter if necessary.
I've made some progress since the original post and have amended the code accordingly.
internal void InsertPDF( Section section, Byte[] pdf )
{
    this.document.NewPage();
    PdfReader pdfreader = new PdfReader( pdf );
    Int32 pages = pdfreader.NumberOfPages;
    for ( Int32 pagenum = 1; pagenum <= pages; pagenum++ )
    {
        PdfImportedPage page = this.writer.GetImportedPage( pdfreader, pagenum );
        PdfContentByte pcb = this.writer.DirectContentUnder;
        pcb.AddTemplate( page, 0, 0 );
        this.document.NewPage();
    }
}
It is close to doing what I want, but as I obviously don't understand the full workings of iText wonder if this is the correct way or there is a better way to do it.
If there is any other information I can provide, let me know.
Any pointers would be appreciated.
Just adding a little more meat to the answer. The solution ended up being found by researching what methods worked with a PdfTemplate which is what a PdfImportedPage is derived from. I've added a little more to show how it interacts with the rest of the document being built up. I hope this helps someone else.
internal static void InsertPDF( PdfWriter writer, Document document, Section section, Byte[] pdf )
{
Paragraph para = new Paragraph();
// Add note to show blank page is intentional
para.Add( new Phrase( "PDF follows on the next page.", <your font> ) );
section.Add( para );
// Need to update the document so we render this page.
document.Add( section );
PdfReader reader = new PdfReader( pdf );
PdfContentByte pcb = writer.DirectContentUnder;
Int32 pages = reader.NumberOfPages;
for ( Int32 pagenum = 1; pagenum <= pages; pagenum++ )
{
document.NewPage();
PdfImportedPage page = writer.GetImportedPage( reader, pagenum );
// Render their page in our document.
pcb.AddTemplate( page, 0, 0 );
}
}
To insert an existing PDF into a new page, I changed the order so that NewPage() is called after the template is added:
PdfImportedPage page2 = writer.GetImportedPage(pdf, 1);
cb.AddTemplate(page2, 0, 0);
document.NewPage();

Use PDFBox to Merge Pages?

I know I can use PDFBox to merge multiple PDF's into one PDF. But is there a way to merge pages? For example, I have a header in PDF and want it to be inserted to the top of the first page of the combined PDF and push everything down. Is there a way to do it using PDFBox API?
Here is some code that works to copy two files into a merged one with multiple copies of each one. It copies by pages. It's something I got using the information in the answer to this question: Can duplicating a pdf with PDFBox be small like with iText?
So all you have to do is to make one copy only of the first page of doc1 and one copy only of all pages of doc2. There's a comment where you'll have to make a change to leave off some pages.
final int COPIES = 1; // total copies
// Same code as linked answer mostly
PDDocument samplePdf = new PDDocument();
InputStream in1 = this.getClass().getResourceAsStream(DOC1_NAME);
PDDocument doc1 = PDDocument.load(in1);
List<PDPage> pages = (List<PDPage>) doc1.getDocumentCatalog().getAllPages();
// *** Change this loop to only copy the pages you want from DOC1
for (PDPage page : pages) {
for (int i = 0; i < COPIES; i++) { // loop for each additional copy
samplePdf.importPage(page);
}
}
// Same code again mostly
InputStream in2 = this.getClass().getResourceAsStream(DOC2_NAME);
PDDocument doc2 = PDDocument.load(in2);
pages = (List<PDPage>) doc2.getDocumentCatalog().getAllPages();
for (PDPage page : pages) {
for (int i = 0; i < COPIES; i++) { // loop for each additional copy
samplePdf.importPage(page);
}
}
// Then write the results out
File output = new File(OUT_NAME);
FileOutputStream out = new FileOutputStream(output);
samplePdf.save(out);
out.close();
samplePdf.close();
in1.close();
doc1.close();
in2.close();
doc2.close();

ITextSharp adding text. Some text not showing up

I am adding text to an already created pdf document using this method.
ITextSharp insert text to an existing pdf
Basically it uses the PdfContentByte and then adds the content template to the page.
I am finding that in some areas of the file, the text doesn't show up.
It seems that the text I am adding is showing up behind the content that is already on the page? I flattened the pdf document down to it just being images but I am still having the same issue happen with the flattened file.
Has anyone had any issues adding text being hidden using Itextsharp?
I also tried using DirectContentUnder as was suggested in this link to no avail..
iTextSharp hides text when write
Here is the code I am using...With this I am trying to basically overlay graph paper on top of the PDF. In this example, there is a box in the upper left corner of every page that doesn't get populated. There is an image in the original pdf in this spot. And on the 4th and 5th pages, there are boxes that don't get populated, but they don't seem to be images.
PdfReader reader = new PdfReader(oldFile);
iTextSharp.text.Rectangle size = reader.GetPageSizeWithRotation(1);
Document document = new Document(size);
// open the writer
FileStream fs = new FileStream(newFile, FileMode.Create, FileAccess.Write);
PdfWriter writer = PdfWriter.GetInstance(document, fs);
document.Open();
// the pdf content
PdfContentByte cb = writer.DirectContent;
for (int i = 0; i < reader.NumberOfPages; i++)
{
document.NewPage();
// select the font properties
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA_BOLD, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
cb.SetFontAndSize(bf, 4);
cb.SetColorStroke(BaseColor.GREEN);
cb.SetLineWidth(1f);
for (int j = 10; j < 600; j += 10)
{
WriteToDoc(ref cb, j.ToString(), j, 10);//Write the line number
WriteToDoc(ref cb, j.ToString(), j, 780);//Write the line number
if (j % 20 == 0)
{
cb.MoveTo(j, 20);
cb.LineTo(j, 760);
cb.Stroke();
}
}
for (int j = 10; j < 800; j += 10)
{
WriteToDoc(ref cb, j.ToString(), 5, j);//Write the line number
WriteToDoc(ref cb, j.ToString(), 590, j);//Write the line number
if (j % 20 == 0)
{
cb.MoveTo(15, j);
cb.LineTo(575, j);
cb.Stroke();
}
}
// create the new page and add it to the pdf
PdfImportedPage page = writer.GetImportedPage(reader, i + 1);
cb.AddTemplate(page, 0, 0);
}
// close the streams and voilà, the file should be changed :)
document.Close();
fs.Close();
writer.Close();
reader.Close();
Thanks for any of the help you can provide...I really appreciate it!
-Greg
First of all: If you are trying to basically overlay graph paper on top of the PDF, why do you first draw the graph paper and stamp the original page onto it? You essentially are underlaying graph paper, not overlaying it.
Depending on the content of the page, graph paper drawn this way can easily get covered. E.g. if there is a filled rectangle in the page content, the result is a box in the upper left corner of every page that doesn't get populated.
Thus, simply add the old page content first, then add the overlay changes.
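The effect of drawing order can be seen in a toy model that has nothing to do with iText itself. The sketch below, in Java, treats one horizontal strip of a page as an array of characters; PaintOrder, drawGrid and drawPage are hypothetical names, and the "page content" is assumed to be a single opaque rectangle covering the left half of the strip.

```java
import java.util.Arrays;

/** Toy painter's model: whatever is drawn later overwrites what was drawn earlier. */
final class PaintOrder {
    static final char EMPTY = '.';
    static final char GRID = '#';
    static final char PAGE = 'P';

    /** Draws a grid line on every second column of the strip. */
    static void drawGrid(char[] strip) {
        for (int x = 0; x < strip.length; x += 2) {
            strip[x] = GRID;
        }
    }

    /** Paints an opaque page element (e.g. a filled rectangle) over columns [from, to). */
    static void drawPage(char[] strip, int from, int to) {
        Arrays.fill(strip, from, to, PAGE);
    }

    /** Renders an 8-column strip with the grid drawn either before or after the page. */
    static String render(boolean gridFirst) {
        char[] strip = new char[8];
        Arrays.fill(strip, EMPTY);
        if (gridFirst) {
            drawGrid(strip);
            drawPage(strip, 0, 4);
        } else {
            drawPage(strip, 0, 4);
            drawGrid(strip);
        }
        return new String(strip);
    }

    public static void main(String[] args) {
        System.out.println("grid first: " + render(true));
        System.out.println("page first: " + render(false));
    }
}
```

With the grid drawn first, the opaque rectangle wipes out the grid lines underneath it; with the page drawn first, the grid lines remain visible across the whole strip.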
That being said, for the task of applying changes to an existing PDF, using PdfWriter and GetImportedPage is less than optimal. This is actually a task for the PdfStamper class, which is made for stamping additional content onto existing PDFs.
E.g. have a look at the sample StampText, the pivotal code being:
PdfReader reader = new PdfReader(resource);
using (var ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, ms))
{
PdfContentByte canvas = stamper.GetOverContent(1);
ColumnText.ShowTextAligned( canvas, Element.ALIGN_LEFT, new Phrase("Hello people!"), 36, 540, 0 );
}
return ms.ToArray();
}

Insert PDF in PDF (NOT merging files)

I'd like to insert a PDF page in another PDF page scaled. I'd like to use iTextSharp for this.
I have a vector drawing which can be exported as a single page PDF file. I would like to add this file into a page of other PDF document just like I would add an image to a PDF document.
Is this possible?
The purpose of this is to retain the ability to zoom in without losing quality.
It is very hard to reproduce the vector drawing using PDF vectors because it is an extremely complex drawing.
Exporting the vector drawing as high resolution image is not an option since I have to use a lot of them in a single PDF document. The final PDF would be very large and its writing too slow.
This is relatively easy to do although there's a couple of ways to go about it. If you're creating a new document that has the other documents inside of it and nothing else then the easiest thing to use is probably the PdfWriter.GetImportedPage(PdfReader, Int). This will give you a PdfImportedPage (which inherits from PdfTemplate). Once you have that you can add it to your new document by using PdfWriter.DirectContent.AddTemplate(PdfImportedPage, Matrix).
There's a couple of overloads to AddTemplate() but the easiest one (at least for me) is the one that takes a System.Drawing.Drawing2D.Matrix. If you use this you can easily scale and translate (change x,y) without having to think in "matrix" terms.
Below is sample code that shows this off. It targets iTextSharp 5.4.0 although it should work pretty much the same with 4.1.6 if you remove the using statements. It first creates a sample PDF with 12 pages with random background colors. Then it creates a second document and adds each page from the first PDF scaled by 50% so that 4 old pages fit onto 1 new page. See the code comments for further details. This code assumes that all pages are the same size, you might need to perform further calculations if your situation differs.
//Test files that we'll be creating
var file1 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File1.pdf");
var file2 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File2.pdf");
//For test purposes we'll fill the pages with a random background color
var R = new Random();
//Standard PDF creation, nothing special here
using (var fs = new FileStream(file1, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
//Create 12 pages with text on each one
for (int i = 1; i <= 12; i++) {
doc.NewPage();
//For test purposes fill the page with a random background color
var cb = writer.DirectContentUnder;
cb.SaveState();
cb.SetColorFill(new BaseColor(R.Next(0, 256), R.Next(0, 256), R.Next(0, 256)));
cb.Rectangle(0, 0, doc.PageSize.Width, doc.PageSize.Height);
cb.Fill();
cb.RestoreState();
//Add some text to the page
doc.Add(new Paragraph("This is page " + i.ToString()));
}
doc.Close();
}
}
}
//Create our combined file
using (var fs = new FileStream(file2, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
//Bind a reader to the file that we created above
using (var reader = new PdfReader(file1)) {
doc.Open();
//Get the number of pages in the original file
int pageCount = reader.NumberOfPages;
//Loop through each page
for (int i = 0; i < pageCount; i++) {
//We're putting four original pages on one new page so add a new page every four pages
if (i % 4 == 0) {
doc.NewPage();
}
//Get a page from the reader (remember that PdfReader pages are one-based)
var imp = writer.GetImportedPage(reader, (i + 1));
//A transform matrix is an easier way of dealing with changing the dimensions and coordinates of a rectangle
var tm = new System.Drawing.Drawing2D.Matrix();
//Scale the image by half
tm.Scale(0.5f, 0.5f);
//PDF coordinates put 0,0 in the bottom left corner.
if (i % 4 == 0) {
tm.Translate(0, doc.PageSize.Height); //The first item on the page needs to be moved up "one square"
} else if (i % 4 == 1) {
tm.Translate(doc.PageSize.Width, doc.PageSize.Height); //The second needs to be moved up and over
} else if (i % 4 == 2) {
//Nothing needs to be done for the third
} else if (i % 4 == 3) {
tm.Translate(doc.PageSize.Width, 0); //The fourth needs to be moved over
}
//Add our imported page using the matrix that we set above
writer.DirectContent.AddTemplate(imp,tm);
}
doc.Close();
}
}
}
}
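To see where each of the four scaled pages ends up, the same scale-then-translate sequence can be reproduced with java.awt.geom.AffineTransform. This is only a sketch in Java, under the assumption that AffineTransform concatenates operations in the same order as the default (prepend) behaviour of System.Drawing.Drawing2D.Matrix; FourUpLayout, slotTransform and slotOrigin are hypothetical names, and the 595 by 842 point page size is assumed for illustration.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

/** Reproduces the 4-up placement math without any PDF library. */
final class FourUpLayout {

    /** Builds the scale-then-translate transform used for slot i % 4. */
    static AffineTransform slotTransform(int i, double pageWidth, double pageHeight) {
        AffineTransform tm = new AffineTransform();
        tm.scale(0.5, 0.5);
        switch (i % 4) {
            case 0:  tm.translate(0, pageHeight);         break; // up "one square"
            case 1:  tm.translate(pageWidth, pageHeight); break; // up and over
            case 2:                                       break; // stays at the origin
            default: tm.translate(pageWidth, 0);          break; // over
        }
        return tm;
    }

    /** Where the lower-left corner (0,0) of the imported page lands on the new page. */
    static Point2D slotOrigin(int i, double pageWidth, double pageHeight) {
        return slotTransform(i, pageWidth, pageHeight)
                .transform(new Point2D.Double(0, 0), null);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            Point2D p = slotOrigin(i, 595, 842);
            System.out.println("slot " + i + " -> (" + p.getX() + ", " + p.getY() + ")");
        }
    }
}
```

Because the translation is expressed in the already-scaled coordinate system, translating by a full page height moves the imported page by only half a page height on the new page. That is why the sample can pass doc.PageSize.Width and doc.PageSize.Height directly instead of halving them.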
In addition: while I was trying to add a rotated PDF to a rotated PDF, I ran into some rotation problems. It is somewhat confusing, but you should check the "PdfImportedPage.Rotation" of the page that is going to be added to the PDF.
PdfImportedPage page;//page = writer.GetImportedPage(PdfReader reader, int pageNum);
PdfContentByte pcb;//pcb = PdfWriter.DirectContentUnder;
//create matrix to use for rotating imported page
Matrix matrix = new Matrix(a, b, c, d, e, f);
matrix.Rotate(-(page.Rotation));
if (page.Rotation != 0)
pcb.AddTemplate(page, matrix, true);
else
pcb.AddTemplate(page, a, b, c, d, e, f, true);
The code may look silly, but I want to draw your attention to "matrix.Rotate(negative rotation of imported page)".