Easiest way to add raw Open XML code to a Word document - vba

I want to insert a chunk of formatted text into a Word document in VBA. Since this will be done server-side, I have been advised not to take these chunks from another Word document using Office Interop (link), so I presume it would be easiest to use Open XML like this: <w:p><w:r><w:t>String from WriteToWordDoc method.</w:t></w:r></w:p> Sadly, Application.ActiveDocument.Range.InsertXML fails. So what other quick and dirty alternatives do I have?

For the "bare necessities" see this MSDN article:
https://msdn.microsoft.com/en-us/library/office/dn423225.aspx?f=255&MSPPError=-2147217396
It describes what the minimum requirements are for inserting Open XML into an open Word document. Don't worry that the discussion targets Web add-ins - the principle is the same for any API that inserts WordOpenXML into a Word document at run-time.
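As a rough illustration of those minimum requirements, here is a sketch that wraps the bare paragraph from the question in a two-part Flat OPC package: a /_rels/.rels part and a /word/document.xml part. It is written in Python only to show the shape of the string; the helper name flat_opc_paragraph is made up for this example, and the resulting string is what would be handed to Range.InsertXML instead of the bare <w:p> element.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"


def flat_opc_paragraph(text: str) -> str:
    """Wrap one plain-text paragraph in a minimal Flat OPC package (hypothetical helper)."""
    body = f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>"
    return (
        f'<pkg:package xmlns:pkg="{PKG_NS}">'
        # Part 1: package relationships, pointing at the main document part.
        '<pkg:part pkg:name="/_rels/.rels" '
        'pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">'
        '<pkg:xmlData>'
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}" Target="word/document.xml"/>'
        '</Relationships>'
        '</pkg:xmlData></pkg:part>'
        # Part 2: the main document part holding the paragraph itself.
        '<pkg:part pkg:name="/word/document.xml" '
        'pkg:contentType="application/vnd.openxmlformats-officedocument.'
        'wordprocessingml.document.main+xml">'
        '<pkg:xmlData>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
        '</pkg:xmlData></pkg:part>'
        '</pkg:package>'
    )


package_xml = flat_opc_paragraph("String from WriteToWordDoc method.")
root = ET.fromstring(package_xml)  # raises ParseError if the package is not well-formed
print([part.get(f"{{{PKG_NS}}}name") for part in root])
```

Formatting that depends on styles, numbering or images would need additional parts in the package; this sketch covers only directly formatted text.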
I may be misunderstanding where "server-side" is involved in your process, so forgive me if the following is not relevant to your situation: Note that using VBA server-side is a bad idea - Word is not designed to run in a server environment. Better would be to use the Open XML SDK both to retrieve and write the information between two Word documents.

Related

Import .docx contents into MS Access

I began writing a docx document to do a project of mine.
Recently, I realized that it would be easier to manage that data if it was in a database.
So, I wanted to import that data into MS Access automatically, to avoid copying and pasting the data manually.
Is there any way to do it? I have only encountered ways of opening the Word application via Access. I also know that docx has an XML structure, so I imagine that if I can open that structure, it would be easy to write a parser in VBA.
There are two basic ways information can be taken out of a Word document and put into an Access database: automating the Word object model using VBA code running in either Word or Access OR extracting the WordOpenXML that makes up the Word document. You indicate you lean towards the second option.
Here, again, there are a number of approaches available:
Use VBA in Word or Access to extract the WordOpenXML of the document open in the Word application user interface.
Use VBA in Access together with non-VBA tools to "crack open" the Zip file and extract the XML.
Use the tools available in the .NET Framework to extract the content of the ZIP file and write it to Access using an OLE DB connection.
I understand your goal is to be able to recreate the document at a later point for printing, so you want to preserve all the formatting. In addition, you want to be able to read the content from within Access.
I believe this will require a minimum of four fields in the Access table:
1. ID
2. Title
3. Text of song
4. The complete WordOpenXML for re-creating the document
You don't mention (4) in the discussion and problem description, but if you want to store the formatting AND you want to be able to read the content I believe this is necessary. While WordOpenXML is "readable", there's a lot of mark-up in there which doesn't make reading comfortable.
All things being equal, I'd go for either VBA working on the open Word document or the .NET approach, using the Open XML SDK (free download .NET library you can reference in Visual Studio and distribute with solutions).
One important thing to keep in mind is storing the Word Open XML in the database. Unless something has changed in Access, you can't store the ZIP file - you need a "streamable" format. That would be the OOXML OPC flat-file format.
When you read the WordOpenXML from a document using VBA, that's what you get, which is why that would be an option for me. The Open XML SDK doesn't have that option, but there is code available from Eric White's blog for doing this.
When you later want to recreate and print the document it should be enough to stream the WordOpenXML to a file with the .xml extension. Or you could convert it back to a docx zip file (same blog).
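The conversion back to a docx ZIP file can be sketched roughly as follows. This is not the code from that blog, only a simplified Python illustration of the same idea: each pkg:part of the Flat OPC string becomes one ZIP entry, and a [Content_Types].xml is generated from the parts' pkg:contentType attributes. The name flat_opc_to_docx is hypothetical and the sample package is deliberately tiny.

```python
import base64
import io
import zipfile
import xml.etree.ElementTree as ET

PKG = "{http://schemas.microsoft.com/office/2006/xmlPackage}"
RELS_TYPE = "application/vnd.openxmlformats-package.relationships+xml"
DOC_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"

SAMPLE = (
    '<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">'
    f'<pkg:part pkg:name="/_rels/.rels" pkg:contentType="{RELS_TYPE}"><pkg:xmlData>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Target="word/document.xml" Type="http://schemas.'
    'openxmlformats.org/officeDocument/2006/relationships/officeDocument"/>'
    '</Relationships></pkg:xmlData></pkg:part>'
    f'<pkg:part pkg:name="/word/document.xml" pkg:contentType="{DOC_TYPE}"><pkg:xmlData>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>'
    '</pkg:xmlData></pkg:part></pkg:package>'
)


def flat_opc_to_docx(flat_opc: str) -> bytes:
    """Turn a Flat OPC string into ZIP bytes (simplified, hypothetical helper)."""
    overrides = []
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        for part in ET.fromstring(flat_opc).iter(PKG + "part"):
            name = part.get(PKG + "name")
            content_type = part.get(PKG + "contentType")
            xml_data = part.find(PKG + "xmlData")
            if xml_data is not None:
                # XML parts carry their root element as the only child of pkg:xmlData.
                # Caveat: ElementTree may rename namespace prefixes when serializing.
                data = ET.tostring(xml_data[0], encoding="utf-8", xml_declaration=True)
            else:
                # Binary parts (images and the like) are stored base64-encoded.
                data = base64.b64decode(part.findtext(PKG + "binaryData"))
            package.writestr(name.lstrip("/"), data)
            if content_type != RELS_TYPE:
                overrides.append(
                    f'<Override PartName="{name}" ContentType="{content_type}"/>'
                )
        # The ZIP form needs a content-type map that Flat OPC keeps on each part.
        package.writestr(
            "[Content_Types].xml",
            '<?xml version="1.0" encoding="UTF-8"?>'
            '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
            f'<Default Extension="rels" ContentType="{RELS_TYPE}"/>'
            + "".join(overrides)
            + "</Types>",
        )
    return buffer.getvalue()


docx_bytes = flat_opc_to_docx(SAMPLE)
with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
    print(package.namelist())
```

For real documents a prefix-preserving XML serializer is the safer choice, because some Word markup refers to namespace prefixes by name.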

Manipulation of DOCX in VBA?

In C#/.NET I am able to open a DOCX file as a ZipPackage, then manipulate its XML parts separately by getting them as PackageParts and reading from / writing to their Streams using .GetStream().
As far as I'm aware, VBA is a million miles away from having this functionality (especially given that I've not found anything about it after a lot of web searching), but I just thought I'd check: can any VBA aficionados confirm or deny whether VBA has some kind of built-in functionality for manipulating DOCX ZipPackages, or would you pretty much have to write your own VBA DOCX parser from scratch?
Mostly the answer is no, but there is a glimmer of yes.
Regarding your specific question about managing PackageParts through their streams: you could probably do this with some kind of "unzip" utility and then, knowing the OPC structure, navigate to wherever you want in the parts and change things with XSLT or other XML manipulation technologies. You wouldn't be able to do this on the ActiveDocument, though, because its file is already in use while it is open. You could use VBA to create a copy of it, manipulate that copy in a really cumbersome manner and, when your manipulation is done, have VBA close and delete the current ActiveDocument and open your manipulated one as the new ActiveDocument.
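The "work on a copy" idea can be sketched like this. A ZIP entry cannot be edited in place, so the sketch copies every part into a new archive and swaps in the changed one. It is in Python purely for brevity; replace_part and the two-entry sample archive are illustrative, not an existing API or a complete document.

```python
import io
import zipfile


def replace_part(docx_bytes: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of the package with one part replaced (hypothetical helper)."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # Copy every entry unchanged except the one being replaced.
            data = new_content if info.filename == part_name else src.read(info)
            dst.writestr(info.filename, data)
    return out.getvalue()


# Stand-in archive with two entries; a real .docx has more parts.
original = io.BytesIO()
with zipfile.ZipFile(original, "w") as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("word/document.xml", "<old/>")

patched = replace_part(original.getvalue(), "word/document.xml", b"<new/>")
with zipfile.ZipFile(io.BytesIO(patched)) as package:
    print(package.read("word/document.xml"))
```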
On the other hand, there is a way to manipulate WordprocessingML for the current ActiveDocument from VBA, but it would be incredibly difficult to do. Open a document, select something and then switch to the VBE. Then run this:
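Sub InjectXML()
    ' Prints the Flat OPC WordprocessingML of the whole document body
    ' to the Immediate window. Use Selection.Range for the selection only.
    Dim wd As Document: Set wd = ActiveDocument
    Debug.Print wd.Range.WordOpenXML
End Sub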
You'll see all the WordprocessingML spit out in the immediate window. This is actually "Flat OPC WordprocessingML" as it is all maintained in a single string of XML. By using ActiveDocument.Range.InsertXML, you could technically insert Flat OPC-type of WordprocessingML back into the document at the selected location. Here's an example of someone using C# to do this via interop and Linq-to-XML. To do this in VBA would be incredibly hard.
So again, the answer is mostly "no", but a little bit of "yes".
You do not state your ultimate aim, but I guess one possibility is to use Microsoft.Office.Interop.Word to manipulate the Word document.

I need a (preferably free) PDF/Word generator .Net component that can work from a document template

I'm looking for a .Net component that will allow me to generate Word and/or PDF documents.
This must work on the server without MS Office installation. Preferably free. Also, it needs to be able to generate the documents based on an existing template of some sort i.e. I don't want to generate the whole document from scratch but allow a number of different templates that all have similar content that comes from elsewhere (e.g. database, XML files etc).
My initial investigations have turned up iTextSharp (but not sure if it can work from templates).
Any help that can expedite my investigation time will be much appreciated.
Thanks
I use ActivePDF at work with .NET: give it some HTML and it will output a PDF document. It isn't free, though. We did look at a few other options, and this was one of them:
http://pdfcrowd.com/html-to-pdf-api/
It doesn't do Word documents, but it converts HTML (your template) to PDF.

Generating PDF documents from LISP

I want to generate a technical report from lisp (AllegroCL in my case) and I studied various packages/project to help me do this.
Requirements:
Need to generate a PDF
May create an intermediate format like RTF, reStructuredText, HTML, Word DOC or LaTeX
Need to be flexible to be able to add content throughout my application
Need to handle Multi-Page, Headers, Footers, Tables, inclusion of Images.
Possibilities:
cl-pdf and cl-typesetting: I checked this one out and it works for now, but is there a better alternative?
Some Latex generator, but ???
Question:
Do you know of alternatives for easily generating (PDF) reports from Lisp? What is the best workflow to go for?
we are using cl-pdf and cl-typesetting for the last 3 years and it has numerous issues... (like its confusion around encodings, or silently not rendering things that don't fit, or...) so, i don't recommend new development based on them.
currently we are in the process of moving all our export mechanisms to open document format. openoffice is all happy with it, and there's a plugin for ms office, too.
there's .fodt, the so called flat open document text format, which is a mere xml file describing a document. generating it is as easy as generating xml files.
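To give an idea of how little is needed, here is a minimal sketch that emits a flat ODT document. It is written in Python rather than Lisp and assumes only the office and text namespaces of the OpenDocument format; make_fodt is a made-up name, and a real report would add styles, headers, footers and tables on top of this.

```python
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Keep the conventional prefixes in the serialized output.
ET.register_namespace("office", OFFICE)
ET.register_namespace("text", TEXT)


def make_fodt(paragraphs) -> bytes:
    """Build a flat ODT document with one text:p per input string (hypothetical helper)."""
    root = ET.Element(
        f"{{{OFFICE}}}document",
        {
            f"{{{OFFICE}}}version": "1.2",
            f"{{{OFFICE}}}mimetype": "application/vnd.oasis.opendocument.text",
        },
    )
    body = ET.SubElement(root, f"{{{OFFICE}}}body")
    text = ET.SubElement(body, f"{{{OFFICE}}}text")
    for content in paragraphs:
        ET.SubElement(text, f"{{{TEXT}}}p").text = content
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


fodt = make_fodt(["Quarterly report", "Generated paragraph"])
print(fodt.decode("utf-8"))
```

Writing those bytes to a file with the .fodt extension should give something a word processor with OpenDocument support can open.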
you can also make parts of your document read-only with a password (insert a section and mark it read-only and protected by a password. when generating the xml, you can generate random hashes as password...).

Storing arbitrary meta-data in Microsoft Word document

I need to store custom meta-data in a Word document. The 'document properties' are limited to 255 bytes but I have data which is >> 10k
We are using VBA to write a Word extension to interact with our application and want to have our application data stored in the Word document. The idea is that the user can share just the Word document without sharing any other data files of our application.
Does anyone have ideas on how to store arbitrary meta-data efficiently in Office 2003+ documents?
Just tried this in the VBA editor immediate window:
ActiveDocument.Variables.Add "foo", String(10000, ".")
? Len(ActiveDocument.Variables("foo"))
10000
Document variables are stored with the document. They seem to go up to 65280 Unicode characters, as I just found out (tested in Word 2003; Update: This limit is the same in Word 2007 and 2010).
If you fear to get near the 60k for your data, I'd recommend either splitting it over several variables, or compressing it before you store it. This post shows some options to do that in VBA.
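Both suggestions, compressing and splitting, can be combined. The sketch below is in Python just to show the logic, which would still need porting to VBA. The variable names appdata_0, appdata_1 and so on, as well as the helper names, are invented for the example, and the 65280 limit is the figure measured above.

```python
import base64
import zlib

MAX_CHARS = 65280  # per-variable limit reported above


def to_variables(data: str, prefix: str = "appdata", limit: int = MAX_CHARS) -> dict:
    """Compress data, then split it into named chunks of at most `limit` characters."""
    # Base64 keeps the compressed bytes representable as ordinary text.
    packed = base64.b64encode(zlib.compress(data.encode("utf-8"))).decode("ascii")
    chunks = [packed[i:i + limit] for i in range(0, len(packed), limit)]
    return {f"{prefix}_{n}": chunk for n, chunk in enumerate(chunks)}


def from_variables(variables: dict, prefix: str = "appdata") -> str:
    """Reassemble the chunks in numeric order and undo the compression."""
    count = sum(1 for name in variables if name.startswith(prefix + "_"))
    packed = "".join(variables[f"{prefix}_{n}"] for n in range(count))
    return zlib.decompress(base64.b64decode(packed)).decode("utf-8")


# A small limit is used here only to force several chunks in the demo.
stored = to_variables("application data " * 500, limit=40)
print(len(stored), "variables needed")
print(from_variables(stored) == "application data " * 500)
```

Each key/value pair would map onto one ActiveDocument.Variables entry.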
This does not exactly fit your requirements because it does not work with Word 2003. However, it might still be interesting to you (otherwise I recommend that you use document variables as suggested by Tomalak):
In Word 2007 (and later) you may embed arbitrary XML documents in a Word document as Custom XML.
An example is described in this article on MSDN: How to: Add Custom XML Parts to Documents by Using VSTO Add-Ins