Get and set metadata for an iText PDF document

I have an iText Document object and I want to write some metadata into it or read from it.
How can I do that?
Imagine that the document is being passed to a method like this:
public void prePreccess(Object document) {
Document pdfDocument = ((Document) document);
//What to do here with pdfDocument?
}

Do you want to populate the info dictionary of a PDF? That's explained in the MetadataPdf example:
// step 1
Document document = new Document();
// step 2
PdfWriter.getInstance(document, new FileOutputStream(filename));
// step 3
document.addTitle("Hello World example");
document.addAuthor("Bruno Lowagie");
document.addSubject("This example shows how to add metadata");
document.addKeywords("Metadata, iText, PDF");
document.addCreator("My program using iText");
document.open();
// step 4
document.add(new Paragraph("Hello World"));
// step 5
document.close();
Do you want to set the XMP metadata? This is explained in the MetadataXmp example:
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(RESULT1));
ByteArrayOutputStream os = new ByteArrayOutputStream();
XmpWriter xmp = new XmpWriter(os);
XmpSchema dc = new com.itextpdf.text.xml.xmp.DublinCoreSchema();
XmpArray subject = new XmpArray(XmpArray.UNORDERED);
subject.add("Hello World");
subject.add("XMP & Metadata");
subject.add("Metadata");
dc.setProperty(DublinCoreSchema.SUBJECT, subject);
xmp.addRdfDescription(dc);
PdfSchema pdf = new PdfSchema();
pdf.setProperty(PdfSchema.KEYWORDS, "Hello World, XMP, Metadata");
pdf.setProperty(PdfSchema.VERSION, "1.4");
xmp.addRdfDescription(pdf);
xmp.close();
writer.setXmpMetadata(os.toByteArray());
// step 3
document.open();
// step 4
document.add(new Paragraph("Hello World"));
// step 5
document.close();
Note that this method is deprecated: we have replaced the XMP functionality recently, but we still have to write some examples using the new code.
Maybe you want to populate the info dictionary and create the XMP metadata at the same time:
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
document.addTitle("Hello World example");
document.addSubject("This example shows how to add metadata & XMP");
document.addKeywords("Metadata, iText, step 3");
document.addCreator("My program using 'iText'");
document.addAuthor("Bruno Lowagie");
writer.createXmpMetadata();
// step 3
document.open();
// step 4
document.add(new Paragraph("Hello World"));
// step 5
document.close();
If I were you, I'd use this option because it's the most complete solution.
You should not read the metadata from a Document object.
You can read the XMP stream from an existing PDF like this:
public void readXmpMetadata(String src, String dest) throws IOException {
PdfReader reader = new PdfReader(src);
FileOutputStream fos = new FileOutputStream(dest);
byte[] b = reader.getMetadata();
fos.write(b, 0, b.length);
fos.flush();
fos.close();
reader.close();
}
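The bytes returned by getMetadata() are an XML (RDF) packet, so once you have them you can inspect them with a regular XML parser. Below is a minimal sketch that uses only the JDK and pulls the dc:subject entries out of such a packet. The class name XmpSubjects, the method subjects and the hand-written SAMPLE packet are assumptions made up for this illustration; a real packet written by iText also carries xpacket processing instructions and padding, which a standard parser is expected to tolerate but which this sketch has not been tried against.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of iText: lists the dc:subject items of an XMP packet.
class XmpSubjects {
    static final String DC = "http://purl.org/dc/elements/1.1/";
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hand-written stand-in for the bytes that reader.getMetadata() would return.
    static final String SAMPLE =
        "<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">"
        + "<rdf:RDF xmlns:rdf=\"" + RDF + "\">"
        + "<rdf:Description rdf:about=\"\" xmlns:dc=\"" + DC + "\">"
        + "<dc:subject><rdf:Bag>"
        + "<rdf:li>Hello World</rdf:li><rdf:li>XMP &amp; Metadata</rdf:li>"
        + "</rdf:Bag></dc:subject>"
        + "</rdf:Description></rdf:RDF></x:xmpmeta>";

    // Parses the packet with a namespace-aware DOM parser and collects every rdf:li under dc:subject.
    static List<String> subjects(byte[] xmp) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmp));
            List<String> result = new ArrayList<>();
            NodeList subjectNodes = doc.getElementsByTagNameNS(DC, "subject");
            for (int i = 0; i < subjectNodes.getLength(); i++) {
                Element subject = (Element) subjectNodes.item(i);
                NodeList items = subject.getElementsByTagNameNS(RDF, "li");
                for (int j = 0; j < items.getLength(); j++) {
                    result.add(items.item(j).getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parseable XMP packet", e);
        }
    }

    public static void main(String[] args) {
        for (String s : subjects(SAMPLE.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(s);
        }
    }
}
```

With iText on the classpath you would pass reader.getMetadata() instead of the sample bytes.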
You can read the entries in the info dictionary like this:
PdfReader reader = new PdfReader(src);
// getInfo() reads the info dictionary; no PdfStamper is needed just to read it
Map<String, String> info = reader.getInfo();
reader.close();
The info object will contain a series of keys and values that are stored as metadata inside the PDF.
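As a rough illustration of working with that map, here is a minimal Java sketch that formats every key/value pair. It does not depend on iText: the map is built by hand as a stand-in for the result of reader.getInfo(), and the class name InfoDump, the method describe and the sample values are assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of iText.
class InfoDump {

    // Formats each metadata entry as "key = value", one entry per line.
    static String describe(Map<String, String> info) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : info.entrySet()) {
            sb.append(entry.getKey()).append(" = ").append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for reader.getInfo(); the values are sample data from the snippets above.
        Map<String, String> info = new LinkedHashMap<>();
        info.put("Title", "Hello World example");
        info.put("Author", "Bruno Lowagie");
        info.put("Keywords", "Metadata, iText, PDF");
        System.out.print(describe(info));
    }
}
```

Title, Author, Subject, Keywords, Creator and Producer are the usual keys of a PDF info dictionary, so those are the entries you would typically look up.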

Related

/AF reference to file embedded into a PDF with iTextSharp

I am struggling to add to the PDF catalog an "/AF" reference to an embedded file object that was just added with the iTextSharp library. The following code works, but the hard-coded literal PdfLiteral("[2 0 R]") is unacceptable, because the reference to the embedded file can of course have an object number other than "2 0":
PdfReader reader = new PdfReader("C:\\ZUGFeRD\\zugferd_2p1_EN16931_Einfach_stripped.pdf");
FileStream outputstream = new FileStream("C:\\ZUGFeRD\\zugferd_2p1_EN16931_Einfach2.pdf", FileMode.Create);
PdfStamper stamp = new PdfStamper(reader, outputstream);
PdfWriter writer = stamp.Writer;
byte[] bytes = System.IO.File.ReadAllBytes("C:\\ZUGFeRD\\factur-x.xml");
// Sample spec: /Desc (factur-x.xml) /EF << /UF 4 0 R /F 4 0 R >> /Type /Filespec /AFRelationship /Alternative /F (factur-x.xml) /UF (factur-x.xml)
PdfFileSpecification fs = PdfFileSpecification.FileEmbedded(writer, "", "factur-x.xml", bytes, false, "text/xml", null);
fs.Put(PdfName.AFRELATIONSHIP, new PdfName("Source"));
stamp.AddFileAttachment("factur-x.xml", fs);
stamp.Writer.ExtraCatalog.Put(PdfName.AF, new PdfLiteral("[2 0 R]"));
stamp.Close();
How can I write it better?

How to extract a portion of a page and write to a new PDF file in itext7?

I want to divide a PDF page into 4 quadrants, then write each quadrant to a separate PDF page (or document). I don't want to crop the existing page, but extract the contents of each quadrant and write them to a new PDF file. Is there a way to do this using iText 7?
I want to mention that the documentation for iTextSharp and iText 7 is poor and lacking in many ways. The book "iText in Action, 2nd Edition" is the only real help if you are willing to read a book, but its examples are only in Java, some of the code is implemented differently in C#, and it covers only iTextSharp 5.
For future reference, assuming you need to split the page into equal parts, here is what will do the trick (this is for a 4x4 grid, that is 16 parts):
public void manipulatePdf(string src, string dest)
{
PdfReader reader = new PdfReader(src);
iTextSharp.text.Rectangle pagesize = reader.GetPageSizeWithRotation(1);
Document document = new Document(pagesize);
PdfWriter writer = PdfWriter.GetInstance(document,
// FileMode.Create truncates an existing file, so no stale bytes are left after the new PDF
new FileStream(dest, FileMode.Create, FileAccess.Write));
document.Open();
PdfContentByte content = writer.DirectContent;
PdfImportedPage page = writer.GetImportedPage(reader, 1);
float x, y;
for (int i = 0; i< 16; i++)
{
x = -pagesize.Width * (i % 4);
y = pagesize.Height * (i / 4 - 3);
content.AddTemplate(page, 4, 0, 0, 4, x, y);
document.NewPage();
}
document.Close();
}
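The offsets in that loop are plain arithmetic, and they generalize to any n x n grid: the page is scaled by n and then shifted so that one tile lands on the output page. The sketch below, in Java and without any iText calls, only computes the x and y values that would be passed as the last two arguments of AddTemplate together with a scale factor of n. The class name TileOffsets, the method offset and the A4 page size are assumptions made up for this illustration. For the four quadrants asked about in the question, n would be 2.

```java
// Hypothetical helper: only the arithmetic from the loop above, no iText calls.
class TileOffsets {

    // Returns the {x, y} translation for tile i (0-based, row by row from the top-left)
    // when the page is scaled by n and drawn on an output page of the original size.
    static float[] offset(float width, float height, int n, int i) {
        float x = -width * (i % n);           // shift left by whole page widths
        float y = height * (i / n - (n - 1)); // integer division selects the row
        return new float[] { x, y };
    }

    public static void main(String[] args) {
        float width = 595f, height = 842f; // assumed A4 page size in points
        int n = 2;                         // 2 x 2 grid, i.e. four quadrants
        for (int i = 0; i < n * n; i++) {
            float[] o = offset(width, height, n, i);
            System.out.println("tile " + i + ": x=" + o[0] + ", y=" + o[1]);
        }
    }
}
```

The top row needs the largest downward shift because PDF coordinates start at the bottom-left corner: after scaling by n, the top row sits (n - 1) page heights above the output page.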

Adding external PDF content generated from SVG using Apache Batik to a source PDF using iText with header and footer

I have the following requirement:
1. Convert the SVG to PDF using Apache Batik.
2. Prepare the source PDF document with header and footer using iText 7.
3. Take the converted PDF and embed it in the content of the source PDF.
I have seen that iText supports converting an SVG to an image, but the output is not rendered correctly. The output from Batik seems to be perfect.
Below is my code. Can anyone please suggest a proper approach? I have not been able to achieve this.
SVG to PDF using Batik:
FileInputStream inputStream = new FileInputStream(new File(Paths.get("Input").toAbsolutePath()+"/test.svg"));
byte[] bytes = IOUtils.toByteArray(inputStream);
FileOutputStream pdfOutputStream = new FileOutputStream(new File(Paths.get("Output").toAbsolutePath()+"/convertedSvg.pdf"));
Transcoder transcoder = new PDFTranscoder();
TranscoderInput transcoderInput = new TranscoderInput(new ByteArrayInputStream(bytes));
TranscoderOutput transcoderOutput = new TranscoderOutput(pdfOutputStream);
int dpi = 300;
transcoder.addTranscodingHint(PDFTranscoder.KEY_WIDTH, new Float(dpi * 29.7));
transcoder.addTranscodingHint(PDFTranscoder.KEY_HEIGHT, new Float(dpi * 42.0));
transcoder.addTranscodingHint(PDFTranscoder.KEY_PIXEL_UNIT_TO_MILLIMETER,(25.4f / 72f));
transcoder.transcode(transcoderInput, transcoderOutput);
iText code:
PdfWriter writer = new PdfWriter(new FileOutputStream(new File(Paths.get("Output").toAbsolutePath()+"/final.pdf")));
PdfDocument pdfDoc = new PdfDocument(writer);
pdfDoc.setDefaultPageSize(PageSize.A3.rotate());
NormalPageHeader headerHandler = new NormalPageHeader(Paths.get("images").toAbsolutePath() + "\\logo.png", pdfFontMap);
pdfDoc.addEventHandler(PdfDocumentEvent.START_PAGE, headerHandler);
PageEndEvent pageEndEvent = new PageEndEvent(Paths.get("images").toAbsolutePath() + "\\FooterLineExternal.png" ,pdfFontMap);
pdfDoc.addEventHandler(PdfDocumentEvent.END_PAGE, pageEndEvent);
Document doc = new Document(pdfDoc);
doc.getPageEffectiveArea(PageSize.A3.rotate());
Table imageTable = new Table(1);
imageTable.setBorder(Border.NO_BORDER);
imageTable.setWidth(UnitValue.createPercentValue(100));
Cell cell = new Cell();
Paragraph paragraph = new Paragraph("Horizontal Trajectory");
paragraph.setVerticalAlignment(VerticalAlignment.TOP);
cell.add(paragraph);
cell.setBorder(Border.NO_BORDER);
cell.setPaddingTop(50);
imageTable.addCell(cell);
doc.add(imageTable);
doc.close();

Multi line form filling with iTextSharp

I'm trying to fill a PDF file that has several form fields, one of which is a multi-line field. I load the values from a txt file (I have to repeat this operation in batch, so I have a single txt file for each PDF I have to fill).
My problem is with the \r\n present in the txt file: whatever I pass to the form field is written literally, as "My text\r\non new line".
I've tried to use a StringBuilder and the .AppendLine method, and if I save its content it correctly shows
My text
on new line
My input file is defined as
<map>
<campoModulo>lo_some_data</campoModulo>
<value>My text\r\non new line</value>
</map>
and here's the code I use to fill the pdf
internal void Process(string p)
{
string output = p.Replace(".pdf", "_new.pdf");
PdfReader reader = new PdfReader(p);
XmlDocument document = new XmlDocument();
document.Load("XML\\mapping.xml");
var items = document.SelectNodes("//map");
PdfStamper pdfStamper = new PdfStamper(reader, new FileStream(
output, FileMode.Create));
var acro = pdfStamper.AcroFields;
if (items.Count > 0)
{
foreach (XmlNode item in items)
{
var campo = item.SelectSingleNode("campoModulo").InnerText;
var valore = item.SelectSingleNode("value").InnerText;
acro.SetField(campo, valore);
}
}
pdfStamper.FormFlattening = true;
pdfStamper.Close();
}
What am I doing wrong?
You have to replace the literal \r\n sequences with the actual control characters. Try:
text = text.Replace(@"\r\n", "\r\n");
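The same idea, sketched in Java (the language of the other iText snippets here) for anyone filling the form with the Java version of the library: the value read from the XML contains the four characters backslash, r, backslash, n, and they have to be turned into a real carriage return and line feed before the value is passed to the field. The class name LineBreakFix and the method unescapeLineBreaks are assumptions made up for this illustration.

```java
// Hypothetical helper: turns the literal text \r\n into real CR+LF characters.
class LineBreakFix {

    // String.replace does a literal (non-regex) replacement, so the backslashes
    // only need to be escaped once for the Java string literal.
    static String unescapeLineBreaks(String raw) {
        return raw.replace("\\r\\n", "\r\n");
    }

    public static void main(String[] args) {
        // As read from the <value> element: contains literal backslashes, no line break.
        String fromXml = "My text\\r\\non new line";
        String fixed = unescapeLineBreaks(fromXml);
        System.out.println(fixed); // now printed on two lines
    }
}
```

The fixed string is what you would hand to the form field instead of the raw InnerText value.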

In iText, is it possible to preserve the reading-order of XFA PDF, as stored in the <traversal> and <traverse> elements, when flattening to PDF/A 1a?

How can I preserve reading-order when using iText to fill and flatten an XFA PDF creating a PDF/A 1a?
The reading order of the pre-flattened PDF is stored in the <traversal> and <traverse> elements, but this information is not being expressed in the flattened tagged PDF/A 1a document.
Specifically, my two-column form, which is read properly by JAWS prior to flattening, is incorrectly read left-to-right from top to bottom all the way across both columns instead of being read down one column and then proceeding to the beginning of the second column.
The iText Java code looks like this:
public void manipulatePdf(String src, String xml, File dest)
throws IOException, DocumentException, InterruptedException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, baos);
AcroFields form = stamper.getAcroFields();
XfaForm xfa = form.getXfa();
stamper.close();
Document document = new Document();
PdfAWriter writer = PdfAWriter.getInstance(document,
new FileOutputStream(dest), PdfAConformanceLevel.PDF_A_1A);
writer.setTagged();
document.addLanguage("en-us");
document.open();
ICC_Profile iccProfile = ICC_Profile.getInstance(new FileInputStream(
colorProfile));
writer.setOutputIntents("Custom", "", "http://www.color.org",
"sRGB IEC61966-2.1", iccProfile);
PdfICCBased iccBased = new PdfICCBased(iccProfile);
iccBased.remove(PdfName.ALTERNATE);
PdfDictionary outputIntent = new PdfDictionary(PdfName.OUTPUTINTENT);
outputIntent.put(PdfName.OUTPUTCONDITIONIDENTIFIER, new PdfString(
"sRGB IEC61966-2.1"));
outputIntent.put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
outputIntent.put(PdfName.S, PdfName.GTS_PDFA1);
outputIntent.put(PdfName.DESTOUTPUTPROFILE, writer.addToBody(iccBased)
.getIndirectReference());
writer.getExtraCatalog().put(PdfName.OUTPUTINTENTS,
new PdfArray(outputIntent));
PdfDictionary markInfo = new PdfDictionary(PdfName.MARKINFO);
markInfo.put(PdfName.MARKED, new PdfBoolean(true));
writer.getExtraCatalog().put(PdfName.MARKINFO, markInfo);
XFAFlattener xfaf = new XFAFlattener(document, writer);
xfaf.flatten(new PdfReader(baos.toByteArray()), false);
document.close();
System.out.println("The form is flattened");
}