How does one save the .MoreInfo property of a PDF with iTextSharp? - pdf

I have the following class, with which I'm trying to add a Hashtable of metadata properties to a PDF. The problem is that, although the Hashtable appears to be assigned to the stamper.MoreInfo property, the MoreInfo data doesn't appear to be saved once the stamper is closed.
public class PdfEnricher
{
readonly IFileSystem fileSystem;
public PdfEnricher(IFileSystem fileSystem)
{
this.fileSystem = fileSystem;
}
public void Enrich(string pdfFile, Hashtable fields)
{
if (!fileSystem.FileExists(pdfFile)) return;
var newFile = GetNewFileName(pdfFile);
var stamper = GetStamper(pdfFile, newFile);
SetFieldsAndClose(stamper, fields);
}
string GetNewFileName(string pdfFile)
{
return fileSystem.GetDirectoryName(pdfFile) + @"\NewFileName.pdf";
}
static void SetFieldsAndClose(PdfStamper stamper, Hashtable fields)
{
stamper.MoreInfo = fields;
stamper.FormFlattening = true;
stamper.Close();
}
static PdfStamper GetStamper(string pdfFile, string newFile)
{
var reader = new PdfReader(pdfFile);
return new PdfStamper(reader, new FileStream(newFile, FileMode.Create));
}
}
Any ideas?

As always, Use The Source.
In this case, I saw a possibility fairly quickly (Java source btw):
public void close() throws DocumentException, IOException {
if (!hasSignature) {
stamper.close( moreInfo );
return;
}
Does this form already have signatures of some sort? Let's see when hasSignature would be true.
That can't be the case with your source. hasSignature is only set when you sign a PDF via PdfStamper.createSignature(...), so that's clearly not it.
Err... how are you checking that your MoreInfo was added? It won't be in the XMP metadata. MoreInfo is added directly to the Doc Info dictionary. You see them in the "Custom" tab of Acrobat (and most likely Reader, though I don't have it handy at the moment).
Are you absolutely sure MoreInfo isn't null, and all its values aren't null?
The Dictionary is just passed around by reference, so any changes (in another thread) would be reflected in the PDF as it was written.
The correct way to iterate through a document's "Doc info dictionary":
PdfReader reader = new PdfReader(somePath);
Map<String, String> info = reader.getInfo();
for (String key : info.keySet()) {
System.out.println( key + ": " + info.get(key) );
}
Note that this will go through all the fields in the document info dictionary, not just the custom ones. Also be aware that changes made to the Map from getInfo() will not carry over to the PDF. The map is new'ed, populated, and returned.
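To illustrate why edits to a returned copy are lost, here is a minimal plain-Java sketch. InfoHolder is a hypothetical stand-in for the reader, not iText code; it only mimics the "new'ed, populated, and returned" behaviour described above.

```java
import java.util.HashMap;
import java.util.Map;

public class InfoCopyDemo {
    // Hypothetical stand-in for a reader that keeps its own metadata (not an iText class).
    static class InfoHolder {
        private final Map<String, String> stored = new HashMap<>();

        InfoHolder() {
            stored.put("Title", "Original");
        }

        // Builds and returns a fresh map on every call, so callers never see the internal one.
        Map<String, String> getInfo() {
            return new HashMap<>(stored);
        }
    }

    public static void main(String[] args) {
        InfoHolder holder = new InfoHolder();
        Map<String, String> info = holder.getInfo();
        info.put("Title", "Changed"); // modifies only the caller's copy
        System.out.println(holder.getInfo().get("Title")); // still "Original"
    }
}
```

To actually change the document info, the new values have to be handed to the stamper (MoreInfo) before it is closed, as in the question's code.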

Related

The data changes when I try to pull it back C#

I'm trying to save a generic list and read it back using a BinaryFormatter, but I can't get the list back in the form I saved it: it returns only the first item in the list. I think the error might be in the part of the code that tries not to overwrite the file. If you need more details, tell me and I'll add them.
#region Save
/// <summary>
/// Saves the given object to the given path as a data in a generic list.
/// </summary>
protected static void Save<T>(string path, object objectToSave)
{
BinaryFormatter formatter = new BinaryFormatter();
FileStream stream;
if (!File.Exists(path))
{
stream = File.Create(path);
}
else
{
stream = File.Open(path, FileMode.Open);
}
List<T> list = new List<T>();
try
{
list = (List<T>)formatter.Deserialize(stream);
}
catch
{
}
list.Add((T)objectToSave);
formatter.Serialize(stream, list);
stream.Close();
}
#endregion
#region Load
/// <summary>
/// Loads the data from given path and returns a list of questions.
/// </summary>
protected static List<T> Load<T>(string path)
{
if (!File.Exists(path))
{
System.Windows.Forms.MessageBox.Show("No file was found at the path " + path + "!");
return null;
}
BinaryFormatter formatter = new BinaryFormatter();
FileStream stream = File.Open(path, FileMode.Open);
List<T> newList;
try
{
newList = (List<T>)formatter.Deserialize(stream);
}
catch
{
newList = null;
}
stream.Close();
return newList;
}
#endregion
OK, I just figured out the problem. Apparently, if you read from the stream (I did it in the line "list = (List<T>)formatter.Deserialize(stream);") and then serialize to the same FileStream, the stream's position is already at the end of the old data, so the new list is appended after the old one and Load only ever reads the first, old list. You have to close the old stream and then reopen it (or open another one), or simply call stream = File.Open(path, FileMode.Open); again before serializing. Thanks anyway :D
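The same effect can be reproduced outside .NET. The following is a rough Java analogue, not the original C# code: it assumes a RandomAccessFile standing in for the FileStream and plain strings standing in for the serialized lists, and saveThenLoad is a hypothetical helper name.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

public class StreamPositionDemo {
    // Writes a record, reads it back, writes a second record, then "loads" from the start.
    // With rewind == false the second record is appended after the first one.
    static String saveThenLoad(boolean rewind) {
        try {
            File f = File.createTempFile("list", ".bin");
            f.deleteOnExit();
            try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
                raf.writeUTF("list with 1 item");  // first save
                raf.seek(0);
                raf.readUTF();                     // read existing data; pointer is now at the end
                if (rewind) {
                    raf.seek(0);                   // go back to the start
                    raf.setLength(0);              // and drop the old record
                }
                raf.writeUTF("list with 2 items"); // second save
                raf.seek(0);
                return raf.readUTF();              // a later load reads the first record it finds
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(saveThenLoad(false)); // old record is still first
        System.out.println(saveThenLoad(true));  // new record replaced the old one
    }
}
```

Rewinding and truncating before the second write plays the same role as closing and reopening the stream in the C# code.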

How to set interactive PDF form in read-only mode while writing as new PDF by using Apache PDFBox?

I am using the Apache PDFBox library to fill in a fillable PDF form (AcroForm). After filling in the information, I need to write it out as a new PDF file (in a non-editable format).
I tried the setReadOnly() method, which is available in the AccessPermission class, but I am still able to edit the values in the newly created PDF document.
Code:
private static PDDocument _pdfDocument;
public static void main(String[] args) {
String originalPdf = "C:/sample/Original.pdf";
String targetPdf = "C:/sample/target.pdf";
try {
populateAndCopy(originalPdf, targetPdf);
-----------
-----------
-----------
-----------
}
} // Main method completed
private static void populateAndCopy(String originalPdf, String targetPdf) throws IOException, COSVisitorException {
_pdfDocument = PDDocument.load(originalPdf);
_pdfDocument.getNumberOfPages();
_pdfDocument.getCurrentAccessPermission().setCanModify(false);
_pdfDocument.getCurrentAccessPermission().setReadOnly();
System.out.println(_pdfDocument.getCurrentAccessPermission().isReadOnly());
_pdfDocument.save(targetPdf);
_pdfDocument.close();
}
Please help me to fix this issue.
Your code does not set any encryption, that is the problem.
Try this:
AccessPermission ap = new AccessPermission();
ap.setCanModify(false);
ap.setReadOnly();
StandardProtectionPolicy spp = new StandardProtectionPolicy("owner-password", "", ap);
spp.setEncryptionKeyLength(128);
doc.protect(spp);
doc.save(targetPdf);
doc.close();
I've set the key length to 128 because 256 is not supported in 1.8 and 40 is too short.
A user will be able to open the document without a password (see the empty user password parameter), but will have only the restricted rights.
public static void main(String[] args) {
try {
String formTemplate = "xyz.pdf";
// load the document
PDDocument pdfDocument = PDDocument.load(new File(formTemplate));
// get the document catalog
PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
// as there might not be an AcroForm entry a null check is necessary
if (acroForm != null)
{
// Retrieve an individual field and set its value.
PDTextField field1 = (PDTextField) acroForm.getField( "_lastName" );
field1.setValue("pppp");
PDTextField field2 = (PDTextField) acroForm.getField( "_firstName" );
field2.setValue("aaaa");
// flatten() merges the field values into the page content, so they can no longer be edited.
// It is called inside the null check to avoid a NullPointerException when there is no AcroForm.
acroForm.flatten();
}
// Save and close the filled out form.
pdfDocument.save("xyz.pdf");
pdfDocument.close();
System.out.println("Done!!");
} catch(Exception e) {
e.printStackTrace();
}
}

Skip adding empty tables to PDF when parsing XHTML using ITextSharp

ITextSharp throws an error when you attempt to create a PdfTable with 0 columns.
I have a requirement to take XHTML that is generated using an XSLT transformation and generate a PDF from it. Currently I am using ITextSharp to do so. The problem is that the generated XHTML sometimes contains tables with 0 rows, so when ITextSharp attempts to parse them into a table it throws an error saying there are 0 columns in the table.
The reason it says 0 columns is because ITextSharp sets the number of columns in the table to the maximum of the number of columns in each row, and since there are no rows the max number of columns in any given row is 0.
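To make the zero-column case concrete, here is a minimal plain-Java sketch of that rule. It is not ITextSharp's code; columnCount is a hypothetical helper that only reproduces the "maximum over all rows" calculation described above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ColumnCountDemo {
    // Column count = the largest number of cells found in any row; 0 when there are no rows.
    static int columnCount(List<List<String>> rows) {
        int max = 0;
        for (List<String> row : rows) {
            max = Math.max(max, row.size());
        }
        return max;
    }

    public static void main(String[] args) {
        List<List<String>> table = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        List<List<String>> emptyTable = Collections.emptyList();

        System.out.println(columnCount(table));      // 2
        System.out.println(columnCount(emptyTable)); // 0, the case that triggers the error
    }
}
```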
How do I go about catching these HTML table declarations with 0 rows and stop them from being parsed into PDF elements?
I've found the piece of code that is causing the error is within the HtmlPipeline, so I could copy and paste the implementation into a class extending HtmlPipeline and overriding its methods and then do my logic to check for empty tables there, but that seems sloppy and inefficient.
Is there a way to catch the empty table before it is parsed?
=Solution=
The Tag Processor
public class EmptyTableTagProcessor : Table
{
public override IList<IElement> End(IWorkerContext ctx, Tag tag, IList<IElement> currentContent)
{
if (currentContent.Count > 0)
{
return base.End(ctx, tag, currentContent);
}
return new List<IElement>();
}
}
And using the Tag Processor...
//CSS
var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
//HTML
var fontProvider = new XMLWorkerFontProvider();
var cssAppliers = new CssAppliersImpl(fontProvider);
var tagProcessorFactory = Tags.GetHtmlTagProcessorFactory();
tagProcessorFactory.AddProcessor(new EmptyTableTagProcessor(), new string[] { "table" });
var htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.SetTagFactory(tagProcessorFactory);
//PIPELINE
var pipeline =
new CssResolverPipeline(cssResolver,
new HtmlPipeline(htmlContext,
new PdfWriterPipeline(document, pdfWriter)));
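//XML WORKER
var xmlWorker = new XMLWorker(pipeline, true);
//XML PARSER (was missing from the snippet; the worker is registered as its listener)
var xmlParser = new XMLParser();
xmlParser.AddListener(xmlWorker);
using (var stringReader = new StringReader(html))
{
xmlParser.Parse(stringReader);
}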
This solution removes the empty table tags and still writes the PDF as a part of the pipeline.
You should be able to write your own tag processor that accounts for that scenario by subclassing iTextSharp.tool.xml.html.AbstractTagProcessor. In fact, to make your life even easier you can subclass the already existing more specific iTextSharp.tool.xml.html.table.Table:
public class TableTagProcessor : iTextSharp.tool.xml.html.table.Table {
public override IList<IElement> End(IWorkerContext ctx, Tag tag, IList<IElement> currentContent) {
//See if we've got anything to work with
if (currentContent.Count > 0) {
//If so, let our parent class worry about it
return base.End(ctx, tag, currentContent);
}
//Otherwise return an empty list which should make everyone happy
return new List<IElement>();
}
}
Unfortunately, if you want to use a custom tag processor you can't use the shortcut XMLWorkerHelper class and instead you'll need to parse the HTML into elements and add them to your document. To do that you'll need an instance of iTextSharp.tool.xml.IElementHandler which you can create like:
public class SampleHandler : iTextSharp.tool.xml.IElementHandler {
//Generic list of elements
public List<IElement> elements = new List<IElement>();
//Add the supplied item to the list
public void Add(IWritable w) {
if (w is WritableElement) {
elements.AddRange(((WritableElement)w).Elements());
}
}
}
You can use the above with the following code which includes some sample invalid HTML.
//Hold everything in memory
using (var ms = new MemoryStream()) {
//Create new PDF document
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//Sample HTML
string html = "<table><tr><td>Hello</td></tr></table><table></table>";
//Create an instance of our element helper
var XhtmlHelper = new SampleHandler();
//Begin pipeline
var htmlContext = new HtmlPipelineContext(null);
//Get the default tag processor
var tagFactory = iTextSharp.tool.xml.html.Tags.GetHtmlTagProcessorFactory();
//Add an instance of our new processor
tagFactory.AddProcessor(new TableTagProcessor(), new string[] { "table" });
//Bind the above to the HTML context part of the pipeline
htmlContext.SetTagFactory(tagFactory);
//Get the default CSS handler and create some boilerplate pipeline stuff
var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, new ElementHandlerPipeline(XhtmlHelper, null)));//Here's where we add our IElementHandler
//The worker dispatches commands to the pipeline stuff above
var worker = new XMLWorker(pipeline, true);
//Create a parser with the worker listed as the dispatcher
var parser = new XMLParser();
parser.AddListener(worker);
//Finally, parse our HTML directly.
using (TextReader sr = new StringReader(html)) {
parser.Parse(sr);
}
//The above did not touch our document. Instead, all "proper" elements are stored in our helper class XhtmlHelper
foreach (var element in XhtmlHelper.elements) {
//Add these to the main document
doc.Add(element);
}
doc.Close();
}
}
}

The IsolatedStorageSettings.Save method in Windows Phone: does it save the whole dictionary?

Does the IsolatedStorageSettings.Save method in a Windows Phone application save the whole dictionary regardless of the changes we made in it? I.e. if we have, say, 50 items in it and change just one, does the Save method save (serialize, etc.) the whole dictionary again and again? Is there any detailed documentation on this class, and does anybody know what data storage format is used "under the hood"?
I've managed to find the implementation of the IsolatedStorageSettings.Save method in the entrails of the Windows Phone emulator VHD images supplied with the Windows Phone SDK (the answer to this question on SO helped me to do that). Here is the source code of the method:
public void Save()
{
lock (this.m_lock)
{
using (IsolatedStorageFileStream isolatedStorageFileStream = this._appStore.OpenFile(this.LocalSettingsPath, 4))
{
using (MemoryStream memoryStream = new MemoryStream())
{
Dictionary<Type, bool> dictionary = new Dictionary<Type, bool>();
StringBuilder stringBuilder = new StringBuilder();
using (Dictionary<string, object>.ValueCollection.Enumerator enumerator = this._settings.get_Values().GetEnumerator())
{
while (enumerator.MoveNext())
{
object current = enumerator.get_Current();
if (current != null)
{
Type type = current.GetType();
if (!type.get_IsPrimitive() && type != typeof(string))
{
dictionary.set_Item(type, true);
if (stringBuilder.get_Length() > 0)
{
stringBuilder.Append('\0');
}
stringBuilder.Append(type.get_AssemblyQualifiedName());
}
}
}
}
stringBuilder.Append(Environment.get_NewLine());
byte[] bytes = Encoding.get_UTF8().GetBytes(stringBuilder.ToString());
memoryStream.Write(bytes, 0, bytes.Length);
DataContractSerializer dataContractSerializer = new DataContractSerializer(typeof(Dictionary<string, object>), dictionary.get_Keys());
dataContractSerializer.WriteObject(memoryStream, this._settings);
if (memoryStream.get_Length() > this._appStore.get_AvailableFreeSpace() + isolatedStorageFileStream.get_Length())
{
throw new IsolatedStorageException(Resx.GetString("IsolatedStorageSettings_NotEnoughSpace"));
}
isolatedStorageFileStream.SetLength(0L);
byte[] array = memoryStream.ToArray();
isolatedStorageFileStream.Write(array, 0, array.Length);
}
}
}
}
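So, as we can see, the whole dictionary is serialized every time we call Save, regardless of how many items changed. The code also shows the storage format: a first line listing the assembly-qualified names of the non-primitive, non-string value types (separated by '\0'), followed by the dictionary serialized with DataContractSerializer.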

Checking off pdf checkbox with itextsharp

I've tried so many different ways, but I can't get the check box to be checked! Here's what I've tried:
var reader = new iTextSharp.text.pdf.PdfReader(originalFormLocation);
using (var stamper = new iTextSharp.text.pdf.PdfStamper(reader,ms)) {
var formFields = stamper.AcroFields;
formFields.SetField("IsNo", "1");
formFields.SetField("IsNo", "true");
formFields.SetField("IsNo", "On");
}
None of them work. Any ideas?
You shouldn't "guess" for the possible values. You need to use a value that is stored in the PDF. Try the CheckBoxValues example to find these possible values:
public String getCheckboxValue(String src, String name) throws IOException {
PdfReader reader = new PdfReader(src);
AcroFields fields = reader.getAcroFields();
// name is the name of a check box field, e.g. "IsNo"
String[] values = fields.getAppearanceStates(name);
StringBuffer sb = new StringBuffer();
for (String value : values) {
sb.append(value);
sb.append('\n');
}
return sb.toString();
}
Or take a look at the PDF using RUPS. Go to the widget annotation and look for the normal (/N) appearance (AP) states. In my example they are /Off and /Yes: