I have written some code that merges multiple PDFs into a single PDF that I then display from the MemoryStream. This works great. What I need to do is add a table of contents to the end of the file with links to the start of each of the individual PDFs. I planned on doing this using the GotoLocalPage action, which takes a page number, but it doesn't seem to work. If I change the action in the code below to one of the preset ones like PdfAction.FIRSTPAGE, it works fine. Does this not work because I am using the PdfCopy object for the writer parameter of GotoLocalPage?
Document mergedDoc = new Document();
MemoryStream ms = new MemoryStream();
PdfCopy copy = new PdfCopy(mergedDoc, ms);
mergedDoc.Open();
MemoryStream tocMS = new MemoryStream();
Document tocDoc = null;
PdfWriter tocWriter = null;
for (int i = 0; i < filesToMerge.Length; i++)
{
string filename = filesToMerge[i];
PdfReader reader = new PdfReader(filename);
copy.AddDocument(reader);
// Initialise TOC document based off first file
if (i == 0)
{
tocDoc = new Document(reader.GetPageSizeWithRotation(1));
tocWriter = PdfWriter.GetInstance(tocDoc, tocMS);
tocDoc.Open();
}
// Create link for TOC, added random number of 3 for now
Chunk link = new Chunk(filename);
PdfAction action = PdfAction.GotoLocalPage(3, new PdfDestination(PdfDestination.FIT), copy);
link.SetAction(action);
tocDoc.Add(new Paragraph(link));
}
// Add TOC to end of merged PDF
tocDoc.Close();
PdfReader tocReader = new PdfReader(tocMS.ToArray());
copy.AddDocument(tocReader);
copy.Close();
displayPDF(ms.ToArray());
I guess an alternative would be to link to a named element (instead of page number) but I can't see how to add an 'invisible' element to the start of each file before adding to the merged document?
I would just go with two passes. In your first pass, do the merge as you are but also record the filename and page number it should link to. In your second pass, use a PdfStamper which will give you access to a ColumnText that you can use general abstractions like Paragraph in. Below is a sample that shows this off:
Since I don't have your documents, the below code creates 10 documents with a random number of pages each just for testing purposes. (You obviously don't need to do this part.) It also creates a simple dictionary with a fake file name as the key and the raw bytes from the PDF as a value. You have a true file collection to work with but you should be able to adapt that part.
//Create a bunch of files, nothing special here
//files will be a dictionary of names and the raw PDF bytes
Dictionary<string, byte[]> Files = new Dictionary<string, byte[]>();
var r = new Random();
for (var i = 1; i <= 10; i++) {
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//Pick a random number of pages once so the loop bound doesn't change on every iteration
var pageCount = r.Next(1, 5);
for (var j = 1; j <= pageCount; j++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("Hello from document {0} page {1}", i, j)));
}
doc.Close();
}
}
Files.Add("File " + i.ToString(), ms.ToArray());
}
}
This next block merges the PDFs. This is mostly the same as your code except that instead of writing a TOC here I'm just keeping track of what I want to write in the future. Where I'm using file.Value you'd use your full file path and where I'm using file.Key you'd use your file's name instead.
//Dictionary of file names (for display purposes) and their page numbers
var pages = new Dictionary<string, int>();
//PDFs start at page 1
var lastPageNumber = 1;
//Will hold the final merged PDF bytes
byte[] mergedBytes;
//Most everything else below is standard
using (var ms = new MemoryStream()) {
using (var document = new Document()) {
using (var writer = new PdfCopy(document, ms)) {
document.Open();
foreach (var file in Files) {
//Add the current page at the previous page number
pages.Add(file.Key, lastPageNumber);
using (var reader = new PdfReader(file.Value)) {
writer.AddDocument(reader);
//Increment our current page index
lastPageNumber += reader.NumberOfPages;
}
}
}
}
mergedBytes = ms.ToArray();
}
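The page bookkeeping in the merge pass is just a running sum, which can be looked at in isolation. Below is a minimal sketch without any iTextSharp calls; the TocPages class and StartPages method are hypothetical names used only for illustration, and the page counts are assumed to come from reader.NumberOfPages in merge order.

```csharp
using System.Collections.Generic;

public static class TocPages
{
    // Given (name, pageCount) pairs in merge order, returns the 1-based page
    // on which each document starts in the merged file.
    // Hypothetical helper, not part of iTextSharp.
    public static Dictionary<string, int> StartPages(
        IEnumerable<KeyValuePair<string, int>> pageCounts)
    {
        var pages = new Dictionary<string, int>();
        //PDFs start at page 1
        var nextPage = 1;
        foreach (var entry in pageCounts)
        {
            //The current document starts where the previous one ended
            pages.Add(entry.Key, nextPage);
            nextPage += entry.Value;
        }
        return pages;
    }
}
```

For example, three documents with 3, 1 and 4 pages would start on pages 1, 4 and 5 of the merged file, and those are the numbers passed to GotoLocalPage in the second pass.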
This last block actually writes the TOC. If we use a PdfStamper we can create a ColumnText, which allows us to use Paragraphs.
//Will hold the final PDF
byte[] finalBytes;
using (var ms = new MemoryStream()) {
using (var reader = new PdfReader(mergedBytes)) {
using (var stamper = new PdfStamper(reader, ms)) {
//The page number to insert our TOC into
var tocPageNum = reader.NumberOfPages + 1;
//Arbitrarily pick one page to use as the size of the PDF
//Additional logic could be added or this could just be set to something like PageSize.LETTER
var tocPageSize = reader.GetPageSize(1);
//Arbitrary margin for the page
var tocMargin = 20;
//Create our new page
stamper.InsertPage(tocPageNum, tocPageSize);
//Create a ColumnText object so that we can use abstractions like Paragraph
var ct = new ColumnText(stamper.GetOverContent(tocPageNum));
//Set the working area
ct.SetSimpleColumn(tocPageSize.GetLeft(tocMargin), tocPageSize.GetBottom(tocMargin), tocPageSize.GetRight(tocMargin), tocPageSize.GetTop(tocMargin));
//Loop through each page
foreach (var page in pages) {
var link = new Chunk(page.Key);
var action = PdfAction.GotoLocalPage(page.Value, new PdfDestination(PdfDestination.FIT), stamper.Writer);
link.SetAction(action);
ct.AddElement(new Paragraph(link));
}
ct.Go();
}
}
finalBytes = ms.ToArray();
}
I have an XFA form that I can successfully fill in by extracting the XML, modifying it and writing it back. This works great if you have the full Adobe Acrobat, but fails with Adobe Reader. I have seen various questions on the same thing with answers, but they were some time ago, so updating an XFA form so that it is still readable by Adobe Reader may no longer be doable?
I use this code below and I've utilised the StampingProperties of append as in the iText example but still failing. I'm using iText 7.1.15.
//open file and write to temp one
PdfDocument pdf = new(new PdfReader(FileToProcess), new PdfWriter(NewPDF), new StampingProperties().UseAppendMode());
PdfAcroForm form = PdfAcroForm.GetAcroForm(pdf, true);
XfaForm xfa = form.GetXfaForm();
XElement node = xfa.GetDatasetsNode();
IEnumerable<XNode> list = node.Nodes();
foreach (XNode item in list)
{
if (item is XElement element && "data".Equals(element.Name.LocalName))
{
node = element;
break;
}
}
XmlWriterSettings settings = new() { Indent = true };
using XmlWriter writer = XmlWriter.Create(XMLOutput, settings);
{
node.WriteTo(writer);
writer.Flush();
writer.Close();
}
//We now have to strip an extra xfa line if updating
if(update)
{
string TempXML= CSTrackerHelper.MakePath($"{AppContext.BaseDirectory}Temp", $"{Guid.NewGuid()}.XML");
StreamReader fsin = new(XMLOutput);
StreamWriter fsout = new(TempXML);
string linedata = string.Empty;
int cnt = 0;
while (!fsin.EndOfStream)
{
if (cnt != 3 && linedata != string.Empty)
{
fsout.WriteLine(linedata);
}
linedata = fsin.ReadLine();
cnt++;
}
fsout.Close();
fsin.Close();
XMLOutput = TempXML;
}
xlogger.Info("Populating pdf fields");
//Now loop through our field data and update the XML
XmlDocument xmldoc = new();
xmldoc.Load(XMLOutput);
XmlNamespaceManager xmlnsManager = new(xmldoc.NameTable);
xmlnsManager.AddNamespace("xfa", @"http://www.xfa.org/schema/xfa-data/1.0/");
string[] FieldValues;
string[] MultiNodes;
foreach (KeyValuePair<string, DocumentFieldData> v in DocumentData.FieldData)
{
if (!string.IsNullOrEmpty(v.Value.Field))
{
FieldValues = v.Value.Field.Contains(";") ? v.Value.Field.Split(';') : (new string[] { v.Value.Field });
foreach (string FValue in FieldValues)
{
XmlNodeList aNodes;
if (FValue.Contains("{"))
{
aNodes = xmldoc.SelectNodes(FValue.Substring(0, FValue.LastIndexOf("{")), xmlnsManager);
if (aNodes.Count > 1)
{
//We have a multinode
MultiNodes = FValue.Split('{');
int NodeIndex = int.Parse(MultiNodes[1].Replace("}", ""));
aNodes[NodeIndex].InnerText = v.Value.Data;
}
}
else
{
aNodes = xmldoc.SelectNodes(FValue, xmlnsManager);
if (aNodes.Count >= 1)
{
aNodes[0].InnerText = v.Value.Data;
}
}
}
}
}
xmldoc.Save(XMLOutput);
//Now we've updated the XML apply it to the pdf
xfa.FillXfaForm(new FileStream(XMLOutput, FileMode.Open, FileAccess.Read));
xfa.Write(pdf);
pdf.Close();
FYI, I've also tried setting a field directly, with the same results.
PdfReader preader = new PdfReader(source);
PdfDocument pdfDoc=new PdfDocument(preader, new PdfWriter(dest), new StampingProperties().UseAppendMode());
PdfAcroForm pdfForm = PdfAcroForm.GetAcroForm(pdfDoc, true);
XfaForm xform = pdfForm.GetXfaForm();
xform.SetXfaFieldValue("VRM[0].CoverPage[0].Wrap2[0].Table[0].CSID[0]", "Test");
xform.Write(pdfForm);
pdfDoc.Close();
If anyone has any ideas it would be appreciated.
Cheers
I ran into a very similar issue. I was attempting to auto-fill an XFA form that was password protected without breaking the certificate or usage rights (it allowed filling). iText7 seems to have made this impossible for legal/practical reasons; however, it is still very much possible with iText5. I wrote the following working code using iTextSharp (the C# version of iText5):
using iTextSharp.text;
using iTextSharp.text.pdf;
string pathToRead = "/Users/home/Desktop/c#pdfParser/encrypted_empty.pdf";
string pathToSave = "/Users/home/Desktop/c#pdfParser/xfa_encrypted_filled.pdf";
string data = "/Users/home/Desktop/c#pdfParser/sample_data.xml";
FillByItextSharp5(pathToRead, pathToSave, data);
static void FillByItextSharp5(string pathToRead, string pathToSave, string data)
{
using (FileStream pdf = new FileStream(pathToRead, FileMode.Open))
using (FileStream xml = new FileStream(data, FileMode.Open))
using (FileStream filledPdf = new FileStream(pathToSave, FileMode.Create))
{
PdfReader.unethicalreading = true;
PdfReader pdfReader = new PdfReader(pdf);
PdfStamper stamper = new PdfStamper(pdfReader, filledPdf, '\0', true);
stamper.AcroFields.Xfa.FillXfaForm(xml, true);
stamper.Close();
pdfReader.Close();
}
}
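PdfStamper stamper = new PdfStamper(pdfReader, filledPdf, '\0', true);
You have to use this line. The final true argument opens the stamper in append mode, so the filled data is written as an incremental update instead of rewriting the original file.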
I want to add a watermark, using the iTextSharp library, to each PDF document that is added to the library. For this I created an event receiver that is triggered when an item is being added. The code is as follows:
using System;
using System.Security.Permissions;
using Microsoft.SharePoint;
using Microsoft.SharePoint.Utilities;
using Microsoft.SharePoint.Workflow;
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.IO;
namespace ProjectPrac.WaterMarkOnUpload
{
/// <summary>
/// List Item Events
/// </summary>
public class WaterMarkOnUpload : SPItemEventReceiver
{
/// <summary>
/// An item is being added.
/// </summary>
public override void ItemAdding(SPItemEventProperties properties)
{
base.ItemAdding(properties);
string watermarkedFile = "Watermarked.pdf";
// Creating watermark on a separate layer
// Creating iTextSharp.text.pdf.PdfReader object to read the Existing PDF Document
PdfReader reader1 = new PdfReader("C:\\Users\\Desktop\\Hello.pdf"); //THE RELATIVE PATH
using (FileStream fs = new FileStream(watermarkedFile, FileMode.Create, FileAccess.Write, FileShare.None))
// Creating iTextSharp.text.pdf.PdfStamper object to write Data from iTextSharp.text.pdf.PdfReader object to FileStream object
using (PdfStamper stamper = new PdfStamper(reader1, fs))
{
// Getting total number of pages of the Existing Document
int pageCount = reader1.NumberOfPages;
// Create New Layer for Watermark
PdfLayer layer = new PdfLayer("WatermarkLayer", stamper.Writer);
// Loop through each Page
for (int i = 1; i <= pageCount; i++)
{
// Getting the Page Size
Rectangle rect = reader1.GetPageSize(i);
// Get the ContentByte object
PdfContentByte cb = stamper.GetUnderContent(i);
// Tell the cb that the next commands should be "bound" to this new layer
cb.BeginLayer(layer);
cb.SetFontAndSize(BaseFont.CreateFont(
BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED), 50);
PdfGState gState = new PdfGState();
gState.FillOpacity = 0.25f;
cb.SetGState(gState);
cb.SetColorFill(BaseColor.BLACK);
cb.BeginText();
cb.ShowTextAligned(PdfContentByte.ALIGN_CENTER, "Confidential", rect.Width / 2, rect.Height / 2, 45f);
cb.EndText();
// Close the layer
cb.EndLayer();
}
}
}
}
}
I want to know how to add the path without hardcoding it here :
PdfReader reader1 = new PdfReader("C:\\Users\\Desktop\\Hello.pdf"); //THE RELATIVE PATH
I also want to upload the watermarked document to the library instead of the original PDF.
I know that it can also be done through a workflow, but I am pretty new to SharePoint. So if your answer involves a workflow, please give a link that explains how to set up the workflow for automating the PDF watermarking.
You don't need to have workflow to achieve what you are looking for:
First, use ItemAdded event instead of ItemAdding. Then you can access SPFile associated with updated list item.
public override void ItemAdded(SPItemEventProperties properties)
{
var password = string.Empty; //or you put some password handling
SPListItem listItemToFile = properties.ListItem;
SPFile pdfOriginalFile = listItemToFile.File;
//get byte[] of uploaded file
byte[] contentPdfOriginalFile = pdfOriginalFile.OpenBinary();
//create reader from byte[]
var pdfReader = new PdfReader(new RandomAccessFileOrArray(contentPdfOriginalFile), password);
using (var ms = new MemoryStream()) {
using (var stamper = new PdfStamper(pdfReader, ms, '\0', true)) {
// do your watermarking stuff
...
// resuming SP stuff
}
var watermarkedPdfContent = ms.ToArray();
base.EventFiringEnabled = false; //to prevent other events being fired
var folder = pdfOriginalFile.ParentFolder;//you want to upload to the same place
folder.Files.Add(pdfOriginalFile.Name, watermarkedPdfContent, true); //overwrite the original with the watermarked bytes
base.EventFiringEnabled = true;
}
}
I probably did a typo or two since I didn't run this code. However, it should give you an idea.
I am using iTextSharp in ASP.NET. We populate a PDF with fields that are taken from one of our online forms. I need to change the way we handle the documents: we need to be able to use some of the fields as the name of the document (firstname-lastname.pdf), and to save that PDF into a directory. Here is the code I am using now:
PdfStamper ps = null;
DataTable dt = BindData();
if (dt.Rows.Count > 0)
{
PdfReader r = new PdfReader(new RandomAccessFileOrArray("http://www.example.com/Documents/ppd-certificate.pdf"), null);
ps = new PdfStamper(r, Response.OutputStream);
AcroFields af = ps.AcroFields;
af.SetField("fullName", dt.Rows[0]["fullName"].ToString());
af.SetField("presentationTitle", dt.Rows[0]["presentationTitle"].ToString());
af.SetField("presenterName", dt.Rows[0]["presenterFullName"].ToString());
af.SetField("date", Convert.ToDateTime(dt.Rows[0]["date"]).ToString("MM/dd/yyyy"));
ps.FormFlattening = true;
ps.Close();
}
PdfStamper and PdfWriter both use the generic Stream class, so instead of Response.OutputStream you can use a FileStream or a MemoryStream.
This example writes directly to disk. Set testFile to whatever you want; I'm using the desktop here.
//Your file path here:
var testFile = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
PdfReader r = new PdfReader(new RandomAccessFileOrArray("http://www.example.com/Documents/ppd-certificate.pdf"), null);
var ps = new PdfStamper(r, fs);
//..code (make sure to call ps.Close() here so the PDF is fully written)
}
This next example is my preferred method. It creates a MemoryStream, then creates a PDF inside of it and finally grabs the raw bytes. Once you've got the raw bytes you can both write them to disk AND Response.BinaryWrite() them.
byte[] bytes;
using (var ms = new MemoryStream()) {
PdfReader r = new PdfReader(new RandomAccessFileOrArray("http://www.example.com/Documents/ppd-certificate.pdf"), null);
var ps = new PdfStamper(r, ms);
//..code (make sure to call ps.Close() here so the PDF is fully written before grabbing the bytes)
bytes = ms.ToArray();
}
//Your file path here:
var testFile = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
//Write to disk
System.IO.File.WriteAllBytes(testFile, bytes);
//Send to HTTP client
Response.BinaryWrite(bytes);
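To get the firstname-lastname.pdf naming from the question, the file name can be built from the field values before calling File.WriteAllBytes. This is only a sketch: the CertificateNaming class, its methods and the separate firstName/lastName values are assumptions for illustration (the question's code only shows a fullName column), and replacing invalid characters with an underscore is just one possible choice.

```csharp
using System;
using System.IO;
using System.Linq;

public static class CertificateNaming
{
    // Builds "firstname-lastname.pdf", replacing any character that is not
    // allowed in a file name on the current platform with '_'.
    // Hypothetical helper, not part of iTextSharp or ASP.NET.
    public static string BuildFileName(string firstName, string lastName)
    {
        var invalid = Path.GetInvalidFileNameChars();
        string Clean(string s) =>
            new string(s.Trim().Select(c => invalid.Contains(c) ? '_' : c).ToArray());
        return Clean(firstName) + "-" + Clean(lastName) + ".pdf";
    }

    // Combines a target directory (supplied by the caller) with the generated name.
    public static string BuildPath(string directory, string firstName, string lastName)
    {
        return Path.Combine(directory, BuildFileName(firstName, lastName));
    }
}
```

With that in place, the write to disk would look something like File.WriteAllBytes(CertificateNaming.BuildPath(outputDirectory, firstName, lastName), bytes), where outputDirectory is whatever folder you want to save into.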
I have a listbox on a PDF.
Users can select multiple options from the listbox.
I need to upload the PDF to a database.
I am unable to retrieve the selected indices from the listbox using iTextSharp.
I tried with
SetListSelection("listbox", PreviousExport.ToArray) but no luck.
How can I retrieve the user-selected indices from a listbox on a PDF using iTextSharp?
Code from comments:
I am using below code to load listbox .. This is from database
form.SetListOption("ddlNoteStatus", strbuilderExport.ToArray, strbuilderDisplay.ToArray)
stamper.AcroFields.SetField("ddlNoteStatus", "3")
I am able to retrieve other, non-listbox fields from the PDF with the code below. But if I use the same code for a listbox, only the last selected value is returned, not all of the values selected by the user.
stamper.AcroFields.GetField("txtDateFollow")
Instead of GetField you want to use GetListSelection. To be safe, you might want to always call GetFieldType to determine the type of field that you're looking at. The below code shows this off:
using (var r = new PdfReader(testFile)) {
var acro = r.AcroFields;
if(acro.GetFieldType("countries") == AcroFields.FIELD_TYPE_LIST ){
Console.WriteLine(String.Join(",", acro.GetListSelection("countries").ToArray()));
}
}
I tested the above code against a PDF that I created using the below code:
var testFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "test.pdf");
using (var fs = new FileStream(testFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, fs)) {
doc.Open();
doc.Add(new Paragraph("Hello World"));
var dd = new iTextSharp.text.pdf.TextField(writer, new iTextSharp.text.Rectangle(50, 500, 200, 550), "countries");
dd.Choices = new string[] { "United States", "Canada", "France" };
dd.ChoiceExports = new string[] { "US", "CA", "FR" };
dd.Options = dd.Options | TextField.MULTISELECT;
dd.ChoiceSelections = new List<int>(new int[] { 0, 2 });
writer.AddAnnotation(dd.GetListField());
doc.Close();
}
}
}
How can I append pages from one PDF file to another PDF file without creating a new PDF, using iTextSharp? I have metadata attached to the first PDF, so I just want to add the pages of the other PDF so that the first PDF's metadata remains as it is.
Regards
Himvj
Assuming you have 2 pdf files: file1.pdf and file2.pdf that you want to concatenate and save the resulting pdf to file1.pdf (by replacing its contents) you could try the following:
using (var output = new MemoryStream())
{
var document = new Document();
var writer = new PdfCopy(document, output);
document.Open();
foreach (var file in new[] { "file1.pdf", "file2.pdf" })
{
var reader = new PdfReader(file);
int n = reader.NumberOfPages;
PdfImportedPage page;
for (int p = 1; p <= n; p++)
{
page = writer.GetImportedPage(reader, p);
writer.AddPage(page);
}
}
document.Close();
File.WriteAllBytes("file1.pdf", output.ToArray());
}
You can try this. It adds the whole document, with metadata:
public static void MergeFiles(string destinationFile, string[] sourceFiles)
{
try
{
//1: Create the MemoryStream for the destination document.
using (MemoryStream ms = new MemoryStream())
{
//2: Create the PdfCopyFields object.
PdfCopyFields copy = new PdfCopyFields(ms);
// - Set the security and other settings for the destination file.
//copy.Writer.SetEncryption(PdfWriter.STRENGTH128BITS, null, "1234", PdfWriter.AllowPrinting | PdfWriter.AllowCopy | PdfWriter.AllowFillIn);
copy.Writer.ViewerPreferences = PdfWriter.PageModeUseOutlines;
// - Create an arraylist to hold bookmarks for later use.
ArrayList outlines = new ArrayList();
int pageOffset = 0;
int f = 0;
//3: Import the documents specified in args[1], args[2], etc...
while (f < sourceFiles.Length)
{
// Grab the file from args[] and open it with PdfReader.
string file = sourceFiles[f];
PdfReader reader = new PdfReader(file);
// Import the pages from the current file.
copy.AddDocument(reader);
// Create an ArrayList of bookmarks in the file being imported.
// ArrayList bookmarkLst = SimpleBookmark.GetBookmark(reader);
// Shift the pages to accommodate any pages that were imported before the current document.
// SimpleBookmark.ShiftPageNumbers(bookmarkLst, pageOffset, null);
// Fill the outlines ArrayList with each bookmark as a HashTable.
// foreach (Hashtable ht in bookmarkLst)
// {
// outlines.Add(ht);
// }
// Set the page offset to the last page imported.
//copy.Writer.SetPageSize(rec);
pageOffset += reader.NumberOfPages;
f++;
}
//4: Put the outlines from all documents under a new "Root" outline and
// set them for destination document
// copy.Writer.Outlines = GetBookmarks("Root", ((Hashtable)outlines[0])["Page"], outlines);
//5: Close the PdfCopyFields object.
copy.Close();
//6: Save the MemoryStream to a file.
MemoryStreamToFile(ms, destinationFile);
}
}
catch (System.Exception e)
{
System.Console.Error.WriteLine(e.Message);
System.Console.Error.WriteLine(e.StackTrace);
System.Console.ReadLine();
}
}
public static void MemoryStreamToFile(MemoryStream MS, string FileName)
{
using (FileStream fs = new FileStream(@FileName, FileMode.Create))
{
byte[] data = MS.ToArray();
fs.Write(data, 0, data.Length);
fs.Close();
}
}