How to finalize edits in a PDF so that text is selectable? - pdf

I'm working on a PDF document that includes input fields that have been filled in. I'd like to be able to press Select All and get all the text from the document, but for some reason none of the input field contents are included. Others have told me that if text isn't selectable, it must be an image, but in this case it's not. So is there a way to 'finalize' the edits and select the text? Maybe by turning the document into a read-only one?
Edit: I've tried changing it to a read-only doc and it doesn't make the inputs selectable.

Related

How to pull text from a shape in a Word document using VBA?

I'm trying to grab the text from inside a shape on a Word document.
Sub textgrab()
    MsgBox ActiveDocument.Shapes("Rectangle 85").TextFrame.TextRange.Text
End Sub
I get the error:
Run-time error '-2147024809 (80070057)':
The item with the specified name wasn't found.
In the Word document, when I go to the Shape Format tab in the top menu and choose 'Selection Pane' in the Arrange section, I get a list of all the shapes, and 'Rectangle 85' is there.
When I select it, it highlights the box I'm trying to grab the value from.
This is a PDF that I've opened in Word. I'm trying to automate a process that will open a PDF invoice, grab the dollar total, and pull it into Excel.
Solution for those that stumble upon this later. I used the following:
ActiveDocument.ActiveWindow.Panes(1).Pages(1).Rectangles.Item(i).Range
Word can only extract text from Drawing objects. These are inserted in the UI, for example, from Insert/Shapes. Shape.TextFrame.TextRange has no OCR capabilities, so can't be used to get text "embedded" in other kinds of graphic objects, such as an embedded PDF file or a JPG or anything similar.
When uncertain whether a particular Shape supports reading or writing text, right-click it in the UI and see if the menu selection Add Text or Edit Text is available.

Libre Office Labels don't show up as "AcroFields" in iTextSharp?

I've been trying to generate a report. I've tried quite a few things already, but there always seem to be problems. I'm currently trying iTextSharp 4.1.6.
My current strategy is to use LibreOffice to create a document with editable pdf fields, or I guess they are called "AcroFields". I'm not sure since I can't find a definition. But anyways, I assume that all of these are "AcroFields":
But if I put all of those into a form and export as pdf only some of them show up as AcroFields:
var reader = new PdfReader(File.ReadAllBytes("abc.pdf"));
foreach (var field in reader.AcroFields.Fields)
{
    Console.WriteLine(((DictionaryEntry)field).Key);
}
> Text Box 1
Check Box 1
Numeric Field 1
Formatted Field 1
Date Field 1
List Box 1
Combo Box 1
Push Button 1
Option Button 1
Notice how Label Field 1 is not present. If it were present then doing a text replace might be easy. Except it's not present so it's looking like even iText can't do a simple text replace in a pdf. Is this true? How would you replace text in a pdf document using iTextSharp?
"Notice how Label Field 1 is not present."
As there is no AcroForm form field type "label", form labels usually are drawn as regular page content in PDF files.
"If it were present then doing a text replace might be easy. Except it's not present so it's looking like even iText can't do a simple text replace in a pdf. Is this true?"
Indeed, in general there is no simple text replacement in a PDF.
"How would you replace text in a pdf document using iTextSharp?"
I would determine the bounding box coordinates of the text to replace, using the iText text extraction feature with an extension that returns text plus coordinates. Then I'd remove that text by redaction using iText's PdfCleanUp... classes. Finally I'd add the replacement text as new text in the bounding box determined at the start.
Unfortunately for you, neither good text extraction nor redaction is available in your version 4.1.6; for this approach you should update to at least 5.5.x.
Alternatively, though, as you've been trying to generate a report, I assume the template design is in your hands. In that case you can put your labels into read-only text fields which you can change (they are read-only only to GUI users).

How do I fill out fillable PDF Form fields using 4gl?

I have a PDF form that I'm filling out with data using Progress 4GL. To date, I've only been filling in text fields, using the following syntax:
put stream stream1 unform
"^global CHX_SINGLE_CE_PLAN3" skip(0)
"X" skip
CHX_SINGLE_CE_PLAN3 is the field name...
This code works when dealing with text fields, but I'm trying to check a box instead of filling in a text field. I cannot find any documentation on this. Is checking a box on a fillable PDF form even possible with 4GL?
As far as I remember PDF Include has support for filling fillable forms. Whilst it's probably a bit over the top in terms of what you want to achieve, it's an open source project and so you may well find the answer to your question within the code itself.
Here's a link to the project page: http://www.oehive.org/pdfinclude
I discovered the answer, which I thought I had already tried before asking this question. The answer is you need to pass the value "Yes" (with capital "Y") in order to check the checkbox. The correct code in this instance is:
put stream stream1 unform
"^global CHX_SINGLE_CE_PLAN3" skip(0)
"Yes" skip
I believe this is the case no matter which language you're using.

Microsoft Word MacroButton - placeholder text visibility

I have a Microsoft Office 2013 Word template, in which I have some text-field elements, created by using Quick Parts -> Field -> MACROBUTTON noname [Type your text here].
If I fill in only some of these fields (e.g. "[Name]", "[Address]") and then print or save as PDF, all the fields that I have not filled in will display as [Insert your text here] on the printed page or in the PDF. To be clear, the placeholder text must be manually removed (or replaced with the text you want).
I've read somewhere that you can create a macro that will not display the placeholder text in the PDF or printed version of the document if no text has been typed into that specific field (you leave it as it was). As this would be handy in cases where you don't fill in all the necessary fields, my question is:
Q: Can this be achieved using only MacroButton, and if not, what is needed to create text fields as described above that are not included in the printed or PDF-saved version of the document?
This cannot be achieved without using actual macro code. Right now your solution contains no macro code; the fields simply function as "targets", and when the user types on a field it is deleted. Where the user does not type, the prompt remains. You'd need code to delete these fields from the document.
Given your requirement, the code would have to fire in the DocumentBeforeSave and the DocumentBeforePrint events. These events require a class and supporting code in a standard module. The basic information on how to set these up is in the Word object model language reference: https://msdn.microsoft.com/en-us/library/office/ff821218.aspx
An alternative to MacroButton fields would be to use ContentControls. But here, again, code and the same events would be required to remove/hide placeholder text.

vb parse word from clipboard into html

I want to copy some Word text with tables, links and images and paste it into a rich textbox in my VB project, where I want to parse it to HTML.
The question is, how can I access the copied Word content in the clipboard? Using
My.Computer.Clipboard.GetText()
just returns text, without the structure for links, tables or images. But there has to be a way to access it, because my rich textbox seems to know the Word format. When I paste into it, tables, images and links are displayed there as well.
You can try specifying the text format through the GetText parameter, as follows:
Dim htmlText As String = Clipboard.GetText(TextDataFormat.Html)
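Note that the HTML clipboard format (CF_HTML) is not bare markup: the payload starts with a short text header (Version, StartHTML, EndHTML, StartFragment, EndFragment) whose values are byte offsets into the UTF-8 data, and the copied content sits between the fragment offsets. As a rough sketch of how that header can be used to cut out the fragment, here is a small Python illustration (not VB, and not part of the original answer). The names extract_fragment and build_cf_html are hypothetical helpers invented for this example, and it assumes the payload is available as a correctly decoded string.

```python
import re


def extract_fragment(cf_html: str) -> str:
    """Return the HTML fragment from a CF_HTML clipboard payload."""
    # The header offsets count UTF-8 bytes, not characters, so slice bytes.
    data = cf_html.encode("utf-8")
    offsets = {}
    for key in ("StartFragment", "EndFragment"):
        match = re.search(rb"^" + key.encode("ascii") + rb":(\d+)", data, re.MULTILINE)
        if match is None:
            raise ValueError(f"missing {key} in CF_HTML header")
        offsets[key] = int(match.group(1))
    return data[offsets["StartFragment"]:offsets["EndFragment"]].decode("utf-8")


def build_cf_html(fragment: str) -> str:
    """Build a minimal CF_HTML payload around a fragment (demo helper only)."""
    header = (
        "Version:0.9\r\n"
        "StartHTML:{:010d}\r\n"
        "EndHTML:{:010d}\r\n"
        "StartFragment:{:010d}\r\n"
        "EndFragment:{:010d}\r\n"
    )
    prefix = "<html><body>\r\n<!--StartFragment-->"
    suffix = "<!--EndFragment-->\r\n</body></html>"
    # The numeric fields are zero-padded to a fixed width, so the header
    # length does not depend on the offset values themselves.
    start_html = len(header.format(0, 0, 0, 0))
    start_frag = start_html + len(prefix.encode("utf-8"))
    end_frag = start_frag + len(fragment.encode("utf-8"))
    end_html = end_frag + len(suffix.encode("utf-8"))
    return header.format(start_html, end_html, start_frag, end_frag) + prefix + fragment + suffix


if __name__ == "__main__":
    sample = build_cf_html('<a href="https://example.com">Zürich</a>')
    print(extract_fragment(sample))
```

The same slicing can be done in VB on the UTF-8 bytes of the returned string. Working on bytes rather than characters matters as soon as the copied text contains non-ASCII characters.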