Remove AcroForm from a PDF - PDFBox

I'm using removeField to remove a field from a document, but how can I remove the AcroForm from a PDF completely?
I'm aware of
acroform.flatten()
but I wonder whether this is the correct way to remove the whole AcroForm. Is there a better way, perhaps one that makes the PDF smaller or removes the AcroForm faster?

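Call PDDocument.getDocumentCatalog().setAcroForm(null) and also remove all widget annotations from each page:
import java.util.ArrayList;
import java.util.List;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotation;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
// "document" is the loaded PDDocument; getPages() assumes PDFBox 2.x or later.
document.getDocumentCatalog().setAcroForm(null);
for (PDPage page : document.getPages())
{
    // Keep every annotation that is not a form field widget.
    List<PDAnnotation> annotations = page.getAnnotations();
    List<PDAnnotation> newList = new ArrayList<>();
    for (PDAnnotation ann : annotations)
    {
        if (!(ann instanceof PDAnnotationWidget))
        {
            newList.add(ann);
        }
    }
    if (newList.isEmpty())
    {
        page.setAnnotations(null);
    }
    else
    {
        page.setAnnotations(newList);
    }
}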

Related

Adding an Annotation to a PdfFormXObject so the Annotation is reusable

I'm using iText 7 to construct reusable PDF components that I reuse across multiple pages within a document. I'm using iText-dotnet for this task (v7), using F# as the language. (This shouldn't be hard to follow for non-F# people as it's just iText calls :D)
I know how to add annotations to a Page, that isn't the issue. Adding the annotation to the page is as simple as page.AddAnnotation(newAnnotation).
Where I'm having difficulty, is that there is no "Page" associated with a Canvas when you are using a PdfFormXObject() to render a Pdf fragment.
let template = new PdfFormXObject(rect)
let templateCanvas = PdfCanvas(template, pageContext.Canvas.GetPdfDocument())
let newCanvas = new Canvas(templateCanvas, rect)
Once I have the new Canvas, I try to write to the Canvas and add the Annotation via Page.AddAnnotation(). The problem is that there is no Page attached to the PdfFormXObject!
// Create the destination and annotation (destPage is the pageNumber)
let dest = PdfExplicitDestination.CreateFitB(destPage)
let action = PdfAction.CreateGoTo(dest)
let annotation = PdfLinkAnnotation(rect)
let border = iText.Kernel.Pdf.PdfAnnotationBorder(0f, 0f, 0f)
// set up the Annotation with action and display information
annotation
    .SetHighlightMode(PdfAnnotation.HIGHLIGHT_PUSH)
    .SetAction(action)
    .SetBorder(border)
    |> ignore
// Try adding the annotation to the page BOOM! (There is *NO* page (null) associated with newCanvas)
newCanvas.GetPage().AddAnnotation(annotation) |> ignore // HELP HERE: Is there another way to do this?
The issue is that I do not know of a different way to set the annotation on the canvas. Is there a way to render the annotation and add it directly to the canvas as raw PDF instructions?
Alternatively, is there a way to create a different reusable PDF fragment in iText so that I can also reuse the GoTo annotation?
N.B. I could split off the annotations and then apply them every time I use the PdfFormXObject() on a new page, but that rather defeats the purpose of reusing PDF fragments (templates) in my final PDF to reduce its size.
If you can point me in the right direction, that would be great.
Again, this is not about how to add an annotation to a Page(), which is easy. It's about how to add an annotation to a PdfFormXObject (or a similar mechanism that I'm unaware of for constructing reusable PDF fragments).
-- As per John's comments below:
I cannot seem to find any reference to single use annotations.
I'm aware of the following example link, so I modified it to look like this:
private static void Main(string[] args)
{
    try
    {
        PdfDocument pdfDocument = new PdfDocument(new PdfWriter("TestMultiLink.pdf"));
        Document document = new Document(pdfDocument);
        string destinationName = "MyForwardDestination";
        // Create a PdfStringDestination to use more than once.
        var stringDestination = new PdfStringDestination(destinationName);
        for (int page = 1; page <= 50; page++)
        {
            document.Add(new Paragraph().SetFontSize(100).Add($"{page}"));
            switch (page)
            {
                case 1: // First use of PdfStringDestination
                    document.Add(new Paragraph(new Link("Click here for a forward jump", stringDestination).SetFontSize(20)));
                    break;
                case 3: // Re-use the stringDestination
                    document.Add(new Paragraph(new Link("Click here for a forward jump", stringDestination).SetFontSize(10)));
                    break;
                case 42:
                    pdfDocument.AddNamedDestination(destinationName, PdfExplicitDestination.CreateFit(pdfDocument.GetLastPage()).GetPdfObject());
                    break;
            }
            if (page < 50)
                document.Add(new AreaBreak(AreaBreakType.NEXT_PAGE));
        }
        document.Close();
    }
    catch (Exception e)
    {
        Console.WriteLine($"Ouch: {e.Message}");
    }
}
If you dig into the iText source for iText.Layout.Link, you'll see that the String Destination is added as an Annotation. Therefore, I'm not sure if John's answer is true anymore.
Does anyone know how I can convert the annotation to a dictionary, and how I would go about adding the raw PdfDictionary info to the PdfFormXObject?
Thanks
@johnwhitington is correct.
Per the PDF specification, annotations can only be added to a page; they cannot be added to a form XObject. This is not a limitation of iText or any other PDF library.
Annotations cannot be reused; each annotation is a distinct object.

Render XHTML on one page to PDF

I have XHTML containing a shop receipt and I am trying to generate a PDF from it. Generation itself is not a problem, but I would like a "break-less" page (the whole content fits on one page).
I have a Kotlin Spring project and am using flying-saucer:
org.xhtmlrenderer:flying-saucer-pdf-itext5:9.1.22
The code is as simple as this:
fun generatePDF(templateString: String): ByteArrayOutputStream {
    val renderer = ITextRenderer()
    renderer.sharedContext.also {
        it.isPrint = true
        it.isInteractive = false
        it.textRenderer.setSmoothingThreshold(0F)
    }
    renderer.setDocumentFromString(templateString, baseUrl)
    renderer.layout()
    val baos = ByteArrayOutputStream()
    renderer.createPDF(baos)
    renderer.finishPDF()
    return baos
}
Is it somehow possible to do it?
Note: I've found some information about page size in the documentation, but I am not sure how to use it if I don't know the exact size (items in receipt are calculated).
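One possible approach, sketched here rather than taken from the thread: since Flying Saucer takes the page size from the CSS @page rule, the receipt could be laid out once to measure how tall the content is, and then rendered again with an @page rule whose height matches. The Java sketch below covers only the string handling. The class and method names (ReceiptPageRule, pageRule, withPageRule) are hypothetical, the measuring step is not shown, and the height passed in is assumed to already include any page margins.

```java
import java.util.Locale;

// Hypothetical helper: builds an @page rule and injects it into an XHTML string.
final class ReceiptPageRule {

    // Returns a CSS @page rule for one page of the given size in millimetres.
    static String pageRule(double widthMm, double heightMm) {
        if (widthMm <= 0 || heightMm <= 0) {
            throw new IllegalArgumentException("page dimensions must be positive");
        }
        // Locale.ROOT keeps the decimal separator a dot, as CSS requires.
        return String.format(Locale.ROOT, "@page { size: %.1fmm %.1fmm; }", widthMm, heightMm);
    }

    // Inserts the rule as the last element of <head>, so it comes after
    // any @page rule already present in the template's own styles.
    static String withPageRule(String xhtml, String rule) {
        int idx = xhtml.indexOf("</head>");
        if (idx < 0) {
            throw new IllegalArgumentException("document has no </head>");
        }
        return xhtml.substring(0, idx)
                + "<style type=\"text/css\">" + rule + "</style>"
                + xhtml.substring(idx);
    }
}
```

The modified XHTML would then be passed to setDocumentFromString as before, for example withPageRule(templateString, pageRule(80, measuredHeightMm)), where 80 mm is only an illustrative receipt width.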

How to add watermark to a landscape file using pdfbox

I'm using PDFBox 1.8.11 and FOP to add watermarks to PDFs. It works nicely for most input PDF files.
However, I have a problem when the file is in landscape: the watermark is rotated 90 degrees to the right.
I had a similar problem with visible signatures, which is fixed thanks to the solution in "sign landscape file". Any idea how to make the watermark rotation work? Thanks in advance!
The original picture for the watermark is:
[image: up arrow]
After FOP watermarking the image is rotated:
[image: rotated image]
Apologies for the late answer.
The idea behind the watermark here is to apply some transforms to the original PDF using Apache FOP. You can find a Java code example and an FO template example on the Apache FOP website.
In any case I will illustrate the example here too:
1. The Java code showing how to use FOP:
import org.apache.fop.apps.*;
import org.xml.sax.*;
import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.sax.*;
import javax.xml.transform.stream.*;
class rendtest {
    private static FopFactory fopFactory = FopFactory.newInstance(new File(".").toURI());
    private static TransformerFactory tFactory = TransformerFactory.newInstance();
    public static void main(String args[]) {
        OutputStream out;
        try {
            //Load the stylesheet
            Templates templates = tFactory.newTemplates(
                new StreamSource(new File(args[1])));
            //First run (to /dev/null)
            out = new org.apache.commons.io.output.NullOutputStream();
            FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
            Transformer transformer = templates.newTransformer();
            transformer.setParameter("page-count", "#");
            transformer.transform(new StreamSource(new File(args[0])),
                new SAXResult(fop.getDefaultHandler()));
            //Get total page count from the results of the first run
            String pageCount = Integer.toString(fop.getResults().getPageCount());
            //Second run (the real thing)
            out = new java.io.FileOutputStream(args[2]);
            out = new java.io.BufferedOutputStream(out);
            try {
                foUserAgent = fopFactory.newFOUserAgent();
                fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
                transformer = templates.newTransformer();
                transformer.setParameter("page-count", pageCount);
                transformer.transform(new StreamSource(new File(args[0])),
                    new SAXResult(fop.getDefaultHandler()));
            } finally {
                out.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
As for the problem I had with rendering landscape PDFs: in the FOP template you only need to add one more attribute to indicate that the file is in landscape layout.
The attribute is reference-orientation="90". With it set, the other definitions in the FOP template are applied properly.

Possible to put HTML annotations on PDF?

I know that we can now put text, links and videos in a PDF, but can we put HTML in an annotation as well?
If there's an SDK, please point me to it as well.
I have searched as much as possible but couldn't find anything on it.
Updated: okay, here are more details. I'm creating a script to create a PDF from an image, and at the same time I have to place annotations on top of the image. When the person clicks the annotation, the HTML will be shown. I understand there are link annotations and shape annotations, but what I'm looking for is the ability to place HTML markup/code in the annotation. For example, I would be able to design a simple form, style some text, or even embed a YouTube video.
I hope I'm clear.
Thanks!
Here is a very basic code sample. Please add the iText JAR to your project.
Code:
import java.io.FileOutputStream;
import com.itextpdf.text.Document;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.Image;
// input is the path or URL of the image, as a String
public void createfromimage(String input){
    Document document = new Document(PageSize.A4.rotate());
    document.setMargins(0,0,0,0);
    String output = "C:/Users/username/Downloads/text.pdf";
    try {
        FileOutputStream fileOutputStream = new FileOutputStream(output);
        PdfWriter pdfWriter = PdfWriter.getInstance(document, fileOutputStream);
        Image image = Image.getInstance(input);
        // make the page exactly as large as the image
        document.setPageSize(new Rectangle(image.getWidth(),image.getHeight()));
        document.open();
        pdfWriter.open();
        document.add(image);
        document.close();
        pdfWriter.close();
    } catch (Exception e){
        e.printStackTrace();
    }
}
You can add annotations in the above way; for link annotations, refer to the link below:
https://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/interactive/annotation/PDAnnotationLink.html
Please note, this is just a simple example.

Hyperlink Detection from PDF

I have some PDFs containing hyperlinks, both URLs and mailto links. Is there any way or tool (maybe third party) to extract the hyperlink meta information from the PDF, such as coordinates, link type and destination address? Any help is highly appreciated.
I have already tried iText and PDFBox without much success, and even some third-party software does not give me the desired output.
I have tried the following code in Java using iText:
PdfReader myReader = new PdfReader("pdf File Path");
PdfDictionary pageDict = myReader.getPageN(1);
PdfArray annots = pageDict.getAsArray(PdfName.ANNOTS);
System.out.println(annots);
ArrayList<String> dests = new ArrayList<String>();
if(annots != null)
{
    for(int i=0; i<annots.size(); ++i)
    {
        PdfDictionary annotDict = annots.getAsDict(i);
        PdfName subType = annotDict.getAsName(PdfName.SUBTYPE);
        if (subType != null && PdfName.LINK.equals(subType))
        {
            PdfDictionary action = annotDict.getAsDict(PdfName.A);
            if(action != null && PdfName.URI.equals(action.getAsName(PdfName.S)))
            {
                dests.add(action.getAsString(PdfName.URI).toString());
            } // else { its an internal link }
        }
    }
}
System.out.println(dests);
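On the "link type" part of the question: once destination strings such as those collected in dests are available, the type can be read off the URI scheme using only the Java standard library. This is a minimal sketch under that assumption; the class name LinkTypeClassifier and its LinkType enum are hypothetical and not part of iText or PDFBox, and it does not cover the coordinates.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper: classifies a link destination by its URI scheme.
final class LinkTypeClassifier {

    enum LinkType { EMAIL, WEB, OTHER }

    static LinkType classify(String destination) {
        try {
            String scheme = new URI(destination.trim()).getScheme();
            if (scheme == null) {
                // Relative reference or plain text: no scheme to go by.
                return LinkType.OTHER;
            }
            switch (scheme.toLowerCase(Locale.ROOT)) {
                case "mailto":
                    return LinkType.EMAIL;
                case "http":
                case "https":
                    return LinkType.WEB;
                default:
                    return LinkType.OTHER;
            }
        } catch (URISyntaxException e) {
            // Strings that are not syntactically valid URIs are left unclassified.
            return LinkType.OTHER;
        }
    }
}
```

Calling LinkTypeClassifier.classify(dest) for each entry of dests would then give the link type next to the destination address.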
You can use the Docotic.Pdf library for link extraction (disclaimer: I work for the company).
Below is code that opens the specified file, finds all hyperlinks, collects information about the position of each link and draws a rectangle around each link.
After that the code creates a new PDF (with links in rectangles) and a text file with the collected information. In the end, both created files are opened in default viewers.
public static void ListAndHighlightLinks(string inputFile, string outputFile, string outputTxt)
{
    using (PdfDocument doc = new PdfDocument(inputFile))
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < doc.Pages.Count; i++)
        {
            PdfPage page = doc.Pages[i];
            foreach (PdfWidget widget in page.Widgets)
            {
                PdfActionArea actionArea = widget as PdfActionArea;
                if (actionArea == null)
                    continue;
                PdfUriAction linkAction = actionArea.Action as PdfUriAction;
                if (linkAction == null)
                    continue;
                Uri url = linkAction.Uri;
                PdfRectangle rect = actionArea.BoundingBox;
                // add information about found link into string buffer
                sb.Append("Page ");
                sb.Append(i.ToString());
                sb.Append(" : ");
                sb.Append(rect.ToString());
                sb.Append(" ");
                sb.AppendLine(url.ToString());
                // draw rectangle around found link
                page.Canvas.DrawRectangle(rect);
            }
        }
        // save document with highlighted links and text information about links to files
        doc.Save(outputFile);
        System.IO.File.WriteAllText(outputTxt, sb.ToString());
        // open created PDF and text file in default viewers
        System.Diagnostics.Process.Start(outputTxt);
        System.Diagnostics.Process.Start(outputFile);
    }
}
You can use the sample code with a call like this:
ListAndHighlightLinks("input.pdf", "output.pdf", "links.txt");
If your PDFs are copy protected, you need to start with step 1; if they're free to copy, you can start with step 2.
Step 1: convert your PDFs into Word .doc files using Adobe Acrobat Pro or an online PDF-to-Word converter:
http://www.pdfonline.com/pdf2word/index.asp
Step 2: copy-paste the whole document into the input window here (you can also download the lightweight HTML tool):
http://www.surf7.net/services/value-added-services/free-web-tools/email-extractor-lite/
Select 'url' as 'Type of address to extract', select your separator, hit extract and that's it.
Hope it works. Cheers.
One possibility would be using a custom JavaScript in Acrobat, which would enumerate the "words" on the page and then read out their Quads. From that you get the coordinates to create a link (or to compare with the links on the page), as well as the actual text (that's the "word(s)").
If it is "only" to set the border of the existing links, you can also write another Acrobat JavaScript which enumerates the links of the document and sets their border color property (you may need to set the width as well).
(If you prefer "buy" over "make", feel free to contact me in private; such things are part of my standard "repertoire".)