Read Index (Table of Content) of word document

I want to get the topic headings (say, all lines styled Heading 1 or Heading 2) from a Word document. Using VBA you can parse through every line in the document and check its style, but this seems tedious. I believe there should be an easier way of doing it. Any pointers?

A pointer:
tempD = ActiveDocument.GetCrossReferenceItems(wdRefTypeHeading)
gives you a list of the headings in the document.
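To pick out only Heading 1 and Heading 2 from that list, one option is to look at the leading spaces Word puts in front of each returned string. Here is a minimal sketch in Python (the other language used on this page), working on plain strings rather than a live document. The two-spaces-per-level indent, the helper name `heading_levels`, and the sample headings are all assumptions, so check them against what your own document returns:

```python
def heading_levels(items, indent_width=2):
    """Return (level, text) pairs for strings shaped like GetCrossReferenceItems output.

    Assumption: each heading level adds `indent_width` leading spaces.
    """
    result = []
    for item in items:
        text = item.lstrip(" ")
        indent = len(item) - len(text)      # number of leading spaces
        level = indent // indent_width + 1  # no indent means level 1
        result.append((level, text.rstrip()))
    return result


# Hypothetical sample standing in for the array returned by Word
sample = ["Introduction", "  Scope", "    Details", "Methods"]

# Keep only the entries that would correspond to Heading 1 and Heading 2
top_two = [(lvl, txt) for lvl, txt in heading_levels(sample) if lvl <= 2]
```

If the indent width turns out to be different in your version of Word, pass it as `indent_width` instead of changing the function.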

Related

Get bookmark index using system.Information()

I'm using the line below to get the selected drop-down value from an endorsement.
ActiveDocument.FormFields(ActiveDocument.Range.Bookmarks(Selection.Information(30)).Name).Dropdown.Value
But I can't get the correct bookmark index through Selection.Information(30), so I get the wrong bookmark name.
Can anyone please help me here?

The more "conventional" way to get the name of the currently active/selected bookmark name is:
Selection.Bookmarks(1).Name
Since this appears to be a form field, it's also possible to get the name via that collection:
Selection.Range.FormFields(1).Name
In a comment the request is for the bookmark index, although the request in the Question is for the bookmark name... In any case, to get the bookmark index, count all bookmarks from the start of the document to the end of the selection. (Note that this gets the index of the last preceding bookmark, which is not necessarily in the selection.) Use that index with the Range's Bookmarks collection, which is ordered by position; ActiveDocument.Bookmarks is ordered by name:
bkmIndex = ActiveDocument.Range(0, Selection.Range.End).Bookmarks.Count
Debug.Print ActiveDocument.Range.Bookmarks(bkmIndex).Name
Please note that Information(30) is an old Word Basic value (I had to look it up in 1995 literature) that has no official equivalent in the VBA object model. It still works for reasons of backwards compatibility, but in such cases there are no guarantees that it will continue to work.

VBA getcrossreferenceitems(wdRefTypeNumberedItem) Paragraph Cut Off?

I'm using Excel VBA to extract information from a Word document.
In the Word document, there are several levels of numbered lists. For example:
1. ABC
1.1 DEF
1.1.1 ABCDEF
2. AAA
2.1 BBB
2.1.1. CCC
and I need to get the full context of each heading in each level and put them into an excel range, i.e. {"1.ABC", "1.1 DEF", "1.1.1 ABCDEF", "2. AAA", "2.1 BBB", "2.1.1. CCC"}
The function I use is:
For Each sec In objDoc.getcrossreferenceitems(wdRefTypeNumberedItem)
However, my headings are truncated if the headings are too long. For example, I have (random text is added for confidentiality reasons):
"5.2.11. Current References: As part of the evaluation process, XXX will conduct 2340AERTQ3493YR. When selecting ADT34534FDGSR, please ensure that they are AERA34AEFDS."
But only
5.2.11. Current References: As part of the evaluation process, XXX will conduct 234
is displayed, and the rest of the sentence is gone.
If anybody has an alternate solution, please let me know.

I can confirm this behavior. A workable, albeit elaborate, solution is to scan the document for all numbered items, which gives you the full text, and then cross-reference that result against the list returned by GetCrossReferenceItems. There's quite some work involved, but it works and lets you build one list of referable Headings and NumberedItems, which is what I did to build a more user-friendly alternative to Word's own implementation.
You'll have to match the formatting Word applies to the list returned by GetCrossReferenceItems, i.e. the indentation and the removal of special characters.
Be careful with track changes. There is a bug in GetCrossReferenceItems: items (in my case headers) that have a tracked change at the beginning of the text are not returned, but internally they are still on the list, so the index is offset. If the item in question is item 11, GetCrossReferenceItems returns the text of item 12 at index 11. A workaround is to accept all revisions before calling GetCrossReferenceItems and undo that afterwards.
It's not easy, but it works.
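The cross-referencing step can be sketched roughly like this, in Python on plain strings rather than a live document. It assumes that each truncated item is a prefix of the full paragraph text once whitespace is normalized, and it does not handle the special-character removal or the track-changes offset mentioned above. `normalize` and `restore_full_text` are hypothetical helper names; the sample text is the one from the question:

```python
import re


def normalize(text):
    """Collapse whitespace runs and trim, so indentation differences do not matter."""
    return re.sub(r"\s+", " ", text).strip()


def restore_full_text(truncated_items, full_paragraphs):
    """For each truncated item, return the first unused full paragraph that starts with it.

    Items without a match are returned in their normalized, still truncated form.
    """
    normalized_full = [normalize(p) for p in full_paragraphs]
    used = set()
    restored = []
    for item in truncated_items:
        key = normalize(item)
        match = None
        for i, full in enumerate(normalized_full):
            if i not in used and full.startswith(key):
                used.add(i)  # each paragraph may satisfy only one item
                match = full
                break
        restored.append(match if match is not None else key)
    return restored


# Stand-in for the GetCrossReferenceItems result (indented and cut off)
truncated = ["    5.2.11. Current References: As part of the evaluation process, XXX will conduct 234"]

# Stand-in for the full text collected by scanning the numbered paragraphs
paragraphs = [
    "5.2.10. Some other item.",
    "5.2.11. Current References: As part of the evaluation process, XXX will conduct "
    "2340AERTQ3493YR. When selecting ADT34534FDGSR, please ensure that they are AERA34AEFDS.",
]

restored = restore_full_text(truncated, paragraphs)
```

Marking paragraphs as used keeps two items with the same opening words from both mapping to the first paragraph.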

I ran into a similar problem in MS Word. Some paragraphs' text is shortened by the following code:
Sub bug()
    items = ActiveDocument.GetCrossReferenceItems(wdRefTypeNumberedItem)
    For idx = 1 To UBound(items)
        MsgBox items(idx)
    Next
End Sub
I had to use a somewhat longer solution (in Python, sorry, but it is easy to rewrite in VBA):
import win32com.client

varHeadings = []
for par in objDoc.Paragraphs:
    # Keep only paragraphs that belong to an outline-numbered list
    if par.Range.ListFormat.ListType == win32com.client.constants.wdListOutlineNumbering:
        idx = par.Range.ListFormat.ListString  # the number label, e.g. "5.2.11."
        txt = par.Range.Text.strip('\r\n')     # paragraph text without the trailing mark
        varHeadings.append('%s %s' % (idx, txt))
This does work. However, as I said, it is somewhat tedious. Did I miss some VBA function in MS Word, or is this a known bug in GetCrossReferenceItems with no replacement in VBA?

Word / PDF - Merge Documents

I am looking to merge two documents, however it is not your typical merge.
My first document is a mail merge that creates a cover letter; each page has a name and address.
My next document is a static document that cannot be changed.
I need to insert the static document into the merged document after every page, so that each cover-letter page is followed by a copy of the static document.
I have tried inserting the document both in Word 2010 and in PDF using Adobe Acrobat and, as you might expect, it only inserted one document after the first page.
I'm looking at VBA, but I have never used VBA with Word before.
Any pointers would be appreciated.
Many thanks

I should have spent more time on this.
The original template contains the fields to merge.
In the static document that I mentioned, go to the Insert tab, Text section, and select Object > Text from File.
Select the cover letter / template that contains the merge fields. This inserts the template, followed by the static document that cannot be changed.
Note: I have spotted some formatting changes in the template after the merge; further work is required.
From this point, start your mail merge and complete the merge to Adobe or Word.
This creates a mail-merged document containing the cover letter with name and address fields, followed by the static document.
Extremely simple. I always overcomplicate things!
I'll work on the changed formatting, but other than that this works.

outlook 2007 - is there a way to get the formatted text from an Appointmentitem?

I'm trying to get the formatted text of an appointment item. I've searched everywhere, and most places suggest getting the Word document of the appointment item:
Word.Document wd = (Word.Document) (item as Outlook.AppointmentItem).GetInspector.WordEditor;
So I do that and I get the Word document. But nowhere does it tell you what to actually do with this Word document once you have it. How do I get the formatted text from the Word document now?
UPDATE:
To anyone else searching for this answer in the future: I figured out how to do this in Outlook 2007.
1) First you have to get the Word document from the appointment item via the WordEditor property.
2) Then you have to use the select and copy functions of the Word document to copy the RTF text to your clipboard.
3) Make a RichTextBox and use its Paste function to paste what's in the clipboard into the RichTextBox.
4) Now you can read the RichTextBox's .Rtf property, which gives you the RTF of the AppointmentItem.
From my searching, this method is the easiest, but you have to take over the clipboard, which isn't ideal. A second way that I read about is to save the Word document from step 1 as an actual RTF file on your computer and then read that RTF file back in.
A third way, I suppose, would be to parse the Word document from step 1 using the Range.FormattedText property.

automating word 2010 to generate docs

The web app was originally built for Office 2007, and I need to convert it so that it works with Office 2010.
I was able to convert the header-generator part of the code, but I have a problem with the body of the document itself. The code copies the data from a "data" doc and pastes it into the generated doc.
appword.activewindow.activepane.view.seekview = 0
'set appsel1 = appword.activewindow.selection
set appsel1 = appword.window(filepath).selection ' that is the original one
appdoc1.bookmarks("b1").select
appword.selection.insertafter("some text")
appsel1.endkey(6) ' the code stops here
appword.selection.insertafter("some other text")
The Internet Explorer debugger says: ERROR: appsel1 object required. When I view appsel1 in the debugger, its value is "empty" instead of "{...}".
Can anyone tell me what I'm doing wrong?
If you need more of the code, tell me.

From MSDN (on InsertAfter):
After this method is applied, the selection expands to include the new
text.
If you use this method with a selection that refers to an entire
paragraph, the text is inserted after the ending paragraph mark (the
text will appear at the beginning of the next paragraph). To insert
text at the end of a paragraph, determine the ending point and
subtract 1 from this location (the paragraph mark is one character).
However, if the selection ends with a paragraph mark that also happens
to be the end of the document, Microsoft Word inserts the text before
the final paragraph mark rather than creating a new paragraph at the
end of the document.
Also, if the selection is a bookmark, Word inserts the specified
text but does not extend the selection or the bookmark to include the
new text.
So I suspect that you still have no selected text.
I wonder if you could call Selection.Collapse(wdCollapseStart) first, but that's just a thought.
