I deleted an annotation from a PDF, but I can still see that annotation in any viewer. What might be wrong in the incremental update?
The attached PDF is at http://www.filedropper.com/gettingstartedadobeencryptedtempannotations.
Please help me resolve this.
I extracted the former revision of your file and compared its first page with the first page of the final revision, as seen in Adobe Reader:
The former revision
The final revision
Quite clearly the annotation highlighting "01 Open a PDF from mail or web" is removed in the final revision.
Thus, your observation is at least incomplete: Adobe Reader, for one, does not show the removed annotation. Please indicate which viewers falsely do.
Furthermore, on inspection the incremental update in the document looks all right. There is just one small peculiarity: the cross-reference entry for the single changed object points to the CRLF before the object, not to the object itself.
A viewer unable to cope with this would often run into problems with other files as well. Viewers are actually known to tolerate much worse problems.
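To make the peculiarity concrete, here is a minimal Python sketch that classifies a cross-reference offset. The function name `check_xref_offset` and the sample bytes are made up for illustration, and the whitespace-skipping fallback is only an assumption about how a lenient viewer might behave, not a description of any particular viewer.

```python
import re

# Matches an object header such as b"183 0 obj".
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")

# The six PDF whitespace bytes: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def check_xref_offset(pdf_bytes, offset, obj_num, gen=0):
    """Return 'exact', 'whitespace-before', or 'bad' for one xref entry."""
    m = OBJ_HEADER.match(pdf_bytes, offset)
    if m and int(m.group(1)) == obj_num and int(m.group(2)) == gen:
        return "exact"
    # Skip leading whitespace and retry (assumed lenient-viewer behaviour).
    pos = offset
    while pos < len(pdf_bytes) and pdf_bytes[pos] in PDF_WHITESPACE:
        pos += 1
    m = OBJ_HEADER.match(pdf_bytes, pos)
    if pos > offset and m and int(m.group(1)) == obj_num and int(m.group(2)) == gen:
        return "whitespace-before"
    return "bad"


# Hypothetical sample data, not taken from the attached file.
sample = b"%PDF-1.7\r\n183 0 obj\r\n<< /Type /Annot >>\r\nendobj\r\n"
start = sample.index(b"183 0 obj")
print(check_xref_offset(sample, start, 183))      # exact
print(check_xref_offset(sample, start - 2, 183))  # whitespace-before
```

An offset that lands on the CRLF falls into the second category, which is the situation described for this file.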
The popup for the annotation was not deleted. That is why, depending on the logic of the reader application, the annotation may still be shown. In the end, the annotation itself still exists and references the page via its P entry, while it is missing from the page's annotations array:
You can see that the "deleted" comment (blue) is still available in the annotation structure:
So in the end you simply have to delete the popup annotation (183), too.
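The idea can be sketched in Python on a toy model in which plain dicts stand in for PDF dictionaries. This is not the API of any real PDF library. Only the popup's object number 183 comes from the file discussed here; the other object numbers (5, 182, 190) and the function name are hypothetical.

```python
def remove_annotation_with_popup(objects, page_num, annot_num):
    """Unlink an annotation and any popup pointing at it from a page.

    `objects` maps object numbers to dicts standing in for PDF dictionaries.
    Returns the object numbers the page should no longer reference.
    """
    page = objects[page_num]
    unlinked = {annot_num}
    # A popup refers back to its markup annotation through its Parent entry.
    for num in page.get("Annots", []):
        entry = objects.get(num, {})
        if entry.get("Subtype") == "Popup" and entry.get("Parent") == annot_num:
            unlinked.add(num)
    page["Annots"] = [n for n in page.get("Annots", []) if n not in unlinked]
    return unlinked


# Toy document: the highlight (182) is already gone from Annots,
# but its popup (183) is still listed there.
objects = {
    5: {"Type": "Page", "Annots": [183, 190]},
    182: {"Subtype": "Highlight", "P": 5, "Popup": 183},
    183: {"Subtype": "Popup", "Parent": 182},
    190: {"Subtype": "Link"},
}
removed = remove_annotation_with_popup(objects, 5, 182)
print(sorted(removed))       # [182, 183]
print(objects[5]["Annots"])  # [190]
```

In a real incremental update the unlinked objects would typically also be marked free in the new cross-reference section.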
When I work with a PDF assembled from multiple PDFs, the PDF Accessibility Checker (PAC) throws the error "PAC Unhandled Exception MCID 1 already present."
Is there any way to see and/or fix this issue from within Acrobat or another tool? Is the MCID visible within an element's tag?
What are MCIDs used for, and does having duplicate MCIDs cause accessibility issues?
Not an answer, but the "MCID n already present" error appears in PAC (and in axesPDF, which contains PAC) as soon as one edits the existing content of a tagged PDF in a way that causes some object to be rebuilt. Seemingly, tagging is an extra layer on top of the basic PDF structure, and PDF editing handles tags only half-heartedly.
A useful thing to do is to open Acrobat's Content panel, which shows the objects, their layering order, and the tag containers in which the objects are placed.
The PAC "MCID n already present" errors stopped appearing after my PDF edits once I moved the objects likely to be affected out of their tag containers in the Content panel and deleted those containers before editing. Then I made my edits, typically moving or changing some text, and retagged it.
Someone who really knows what Acrobat does when one makes edits could perhaps tell which objects can stay in the tag container, so that only those affected by the edits need to be moved out and then moved back after editing. In any case, moving all content out of the tag container, deleting the empty container, editing, and finally retagging in the usual way has worked.
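For background: an MCID is the integer that links a marked-content sequence in a page's content stream to an element in the structure tree, and it is expected to be unique within that content stream. The following minimal Python sketch looks for repeated MCIDs. It assumes you already have the decoded (uncompressed) content stream as bytes and that the MCIDs are written inline in the stream rather than in a named property list. The function name and the sample stream are made up for illustration.

```python
import re
from collections import Counter

# Matches inline marked-content identifiers such as b"/MCID 1".
MCID = re.compile(rb"/MCID\s+(\d+)")


def duplicate_mcids(content_stream):
    """Return the sorted MCID values that occur more than once."""
    counts = Counter(int(value) for value in MCID.findall(content_stream))
    return sorted(n for n, c in counts.items() if c > 1)


# Hypothetical content stream in which MCID 1 is used twice.
stream = (
    b"/P <</MCID 0>> BDC (a) Tj EMC "
    b"/P <</MCID 1>> BDC (b) Tj EMC "
    b"/Span <</MCID 1>> BDC (c) Tj EMC"
)
print(duplicate_mcids(stream))  # [1]
```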
I have been looking all around for a solution to my issue, but I haven't found a fix yet. Here's my problem:
I have a dynamic PDF which contains a table and several text fields per row that grow vertically as the user adds text (multiple lines and expand to fit vertically). The table properly breaks when the content doesn't fit in the current page, however, I found out that in some scenarios, with a certain amount of characters, the row sometimes overlaps the content in the next page (See below).
When I add more text to the offending line, the content in that row properly breaks to the next page (see below).
I am not sure whether or not this is a known issue with the tool or I am missing some sort of configuration/setting in the template. I haven't found anything online or in the Adobe Documentation. Any thoughts?
I am using:
Adobe Acrobat Pro 9
Adobe LiveCycle Designer ES 8.2
The form version of the PDF runs in Adobe Reader 7.0.5 (For compatibility purposes with one of the tools our clients are using)
Thanks in advance
After a long time looking for a solution, I found a single blog post by someone who had the same issue, which, by the way, Adobe was kind enough not to document. Anyhow, the following two processing instructions need to be added to the XML:
<?layout allowDissonantSplits 1?>
<?layout allowJaggedRowSplits 1?>
My XML looks like this:
<template xmlns="http://www.xfa.org/schema/xfa-template/2.4/">
<?formServer defaultPDFRenderFormat acrobat7.0.5dynamic?>
<?formServer allowRenderCaching 0?>
<?formServer formModel both?>
<?layout allowDissonantSplits 1?>
<?layout allowJaggedRowSplits 1?>
The author specifies that the directives should only be added if this problem occurs. I wonder why these instructions should only be used in this situation. The full blog post can be found here:
http://blogs.adobe.com/dmcmahon/2011/10/10/lc-forms-es-text-overlapping-on-page-break-using-nested-subforms/
I hope this saves someone else some time.
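If the same change has to be applied to several templates, it could be scripted. The Python sketch below treats the template source as plain text and inserts whichever of the two layout instructions is missing directly after the opening template tag. The function name is hypothetical, and placing the instructions immediately after the opening tag (rather than after the formServer instructions, as in the listing above) is an assumption that should be checked in Designer.

```python
import re

LAYOUT_PIS = (
    "<?layout allowDissonantSplits 1?>",
    "<?layout allowJaggedRowSplits 1?>",
)


def add_layout_pis(template_xml):
    """Insert the two layout processing instructions if they are missing."""
    opening = re.search(r"<template\b[^>]*>", template_xml)
    if opening is None:
        raise ValueError("no <template> element found")
    missing = [pi for pi in LAYOUT_PIS if pi not in template_xml]
    if not missing:
        return template_xml  # nothing to do
    insertion = "".join("\n" + pi for pi in missing)
    return template_xml[:opening.end()] + insertion + template_xml[opening.end():]


# Shortened, hypothetical template source.
source = (
    '<template xmlns="http://www.xfa.org/schema/xfa-template/2.4/">\n'
    "<?formServer formModel both?>\n"
    "</template>"
)
print(add_layout_pis(source))
```

Running the function a second time on its own output leaves the text unchanged.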
This is not a back-end programming question. I can only modify the markup or script (or the document itself). The reason I'm asking here is that all my searches for appropriate terms inevitably lead to questions and solutions about programming this functionality. I'm not trying to force it via programming; I have to find out why this PDF is behaving differently.
So:
I have a bunch of links to PDFs on a page. Most of them open in new tabs, but one of them, the most recent, starts to open in a tab, but then the tab closes and the PDF gets downloaded as a file instead. All markup is consistent; there's nothing different about the odd one out except the actual URL.
You can see this here:
http://calwater.mwnewsroom.com/Investor-Relations/Financial-Reports/Annual-Reports
All annual reports up to 2012 open in a new tab, but 2013 downloads instead.
This leads me to believe that there is some meta-data property of the PDF itself that tells it how to open, and that, in this case, the 2013 PDF was created using different settings.
Apparently, the PDF was saved out to PDF from InDesign.
Does anyone have any insight?
Problem solved. There was simply an error in the string (like an extra period) that references the attachment such that it couldn't tell it was a PDF. Fixing the reference fixed the problem.
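How the site actually determines the file type is not described here, so the following Python sketch is only an illustration of how a stray period can defeat extension-based type detection. The URLs are hypothetical, and the use of the standard `mimetypes` module is an assumption made for the example.

```python
import mimetypes
from urllib.parse import urlsplit


def guessed_type(url):
    """Guess a MIME type from the file extension in the URL's path."""
    path = urlsplit(url).path
    return mimetypes.guess_type(path)[0]


# Hypothetical URLs; the second one ends with an extra period.
print(guessed_type("http://example.com/reports/annual-2013.pdf"))   # application/pdf
print(guessed_type("http://example.com/reports/annual-2013.pdf."))  # None
```

When the type cannot be determined, a server may fall back to a generic content type, and browsers commonly download such responses instead of displaying them.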
Per the following site...
http://forums.asp.net/t/1630140.aspx?extracting+pdf+pages+using+itextsharp
...I use the function ExtractPages to produce a new PDF based on a range of page numbers. My problem is that I noticed a rectangle on the 2nd page of a PDF was not extracted along with the page. This makes me fear that Adobe comments may not be carried over either when the pages get extracted.
Is there a way I can adjust this code so that comments and objects like rectangles are brought over to the new PDF when ExtractPages is called? Am I missing some syntax, or is that not available in version 5.5.0 of iTextSharp?
Your use of the verb extract in the context of extracting pages is confusing. People will think you want to extract text from a page. In reality, you want to import or copy pages.
The example you refer to uses PdfWriter. That's wrong: you should use PdfStamper (if only one existing PDF is involved) or PdfCopy (if multiple existing PDFs are involved). See my answer to the question How to keep original rotate page in itextSharp (dll) to find out why the example on forums.asp.net is a really, really bad example.
The fact that a page has "a rectangle" is irrelevant. Maybe the rectangle is an annotation. In that case, you're throwing that rectangle away by using the wrong example. Maybe the origin of the page is different from 0,0...
If your purpose is to create a new PDF containing only a selection of pages of the original PDF, please read my answer to Function that I can use to remove a single page from a PDF using iText
I'm working on a tool in Python to extract highlighted passages from PDF files. I regularly highlight PDFs in Preview on OS X Lion but haven't found a good tool to extract these passages. Other apps exist that do allow you to highlight and export such as Skim but I figure there has to be a way to extract the ones I add in Preview.
I figured that the highlights would be stored in the HFS+ extended attributes for the PDF file but after looking at them using xattr it seems that they're stored elsewhere. I also looked at PDFKit but I only saw how to create annotations rather than locate them.
If someone could tell me where to find the highlights/annotations or point me at some documentation that explains this I would really appreciate it.
When using PDFKit you can get the annotations from any PDFPage instance.
[myPDFPage annotations] will return an array of annotations for that particular page.
See the docs for more info.
Technically speaking, highlighting parts of a PDF is adding an annotation to the file. These annotations are PDF objects defined in the PDF specification. They are stored inside the PDF file itself, i.e. they do modify the original file! That's why you'll not find a trace of the highlights in the HFS+ extended attributes...
So the answer to the question of your title line is: Preview stores the highlights inside the PDF file as fully compliant PDF objects.
The answer to your real question implied in your text ('I want to extract the highlighted passages') was well answered by sosborn.
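Since the question mentions a Python tool, here is a rough Python sketch for confirming that a file contains text-markup annotations. It simply scans the raw bytes for annotation subtypes, so it assumes the annotation dictionaries are stored uncompressed; annotations inside compressed object streams would be missed, and a real tool would use a PDF parser and read each annotation's QuadPoints to locate the highlighted text. The function name and the sample bytes are made up for illustration.

```python
import re

# Subtypes of the text-markup annotations defined in the PDF specification.
MARKUP_SUBTYPE = re.compile(
    rb"/Subtype\s*/(Highlight|Underline|StrikeOut|Squiggly)\b"
)


def count_markup_annotations(pdf_bytes):
    """Count text-markup annotation subtypes found in raw PDF bytes."""
    counts = {}
    for match in MARKUP_SUBTYPE.finditer(pdf_bytes):
        name = match.group(1).decode("ascii")
        counts[name] = counts.get(name, 0) + 1
    return counts


# Hypothetical fragment of a PDF with two highlights and one underline.
fragment = (
    b"7 0 obj << /Type /Annot /Subtype /Highlight /QuadPoints [ ] >> endobj\n"
    b"8 0 obj << /Type /Annot /Subtype/Highlight >> endobj\n"
    b"9 0 obj << /Type /Annot /Subtype /Underline >> endobj\n"
)
print(count_markup_annotations(fragment))  # {'Highlight': 2, 'Underline': 1}
```

To try it on a real file, read the file in binary mode and pass the bytes to the function.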