In C#/.NET I am able to open a DOCX file as a ZipPackage, then manipulate its XML parts separately by getting them as PackageParts and reading from / writing to their Streams using .GetStream().
As far as I'm aware, VBA is a million miles away from having this functionality (especially given that I've not found anything about it after a lot of web searching), but I just thought I'd check: can any VBA aficionados confirm or deny whether VBA has some kind of built-in functionality for manipulating DOCX ZipPackages, or would you pretty much have to write your own VBA DOCX parser from scratch?
Mostly the answer is no, but there is a glimmer of yes.
On your specific question about working with PackageParts through their streams: you could probably do this with some kind of "unzip" utility and, knowing the OPC structure, navigate to wherever you want in the parts and change things with XSLT or other XML manipulation technologies. You wouldn't be able to do this on the ActiveDocument, though, because its file is already in use while it is open. You could use VBA to save a copy of it, manipulate that copy (a really cumbersome route), and, when the manipulation is done, have VBA close the current ActiveDocument, delete it, and open the manipulated copy as the new ActiveDocument.
On the other hand, there is a way to manipulate WordprocessingML for the current ActiveDocument from VBA, but it would be incredibly difficult to do. Open a document, select something, and then switch to the VBE. Then run this:
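To illustrate the "unzip a copy and edit one part" route: a DOCX is an ordinary ZIP archive, so any ZIP-capable tool can rewrite a single part of a closed copy. The sketch below is a minimal example using Python's standard zipfile module rather than VBA (my choice, since VBA has nothing built in for this). The function name rewrite_part and the file names in the usage line are hypothetical.

```python
import zipfile

def rewrite_part(src_path, dst_path, part_name, transform):
    """Copy a DOCX package, passing one part's bytes through transform().

    All other parts are copied unchanged. src_path must be a closed copy,
    not the document Word currently has open.
    """
    with zipfile.ZipFile(src_path) as src, \
         zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == part_name:
                data = transform(data)  # e.g. an XML or XSLT rewrite
            dst.writestr(item.filename, data)
```

Something like rewrite_part("copy.docx", "edited.docx", "word/document.xml", my_transform) would then produce the edited package, which VBA could open in place of the original.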
Sub InjectXML()
    ' Despite the name, this only reads: it prints the document's Flat OPC XML.
    Dim wd As Document: Set wd = ActiveDocument
    Debug.Print wd.Range.WordOpenXML
End Sub
You'll see all the WordprocessingML spit out in the Immediate window. This is actually "Flat OPC WordprocessingML", as the whole package is held in a single string of XML. By using ActiveDocument.Range.InsertXML, you could technically insert Flat OPC WordprocessingML back into the document (or use Selection.InsertXML to target the selected location). Here's an example of someone using C# to do this via interop and LINQ to XML. To do this in VBA would be incredibly hard.
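To make the "single string of XML" point concrete: in Flat OPC, each package part becomes a pkg:part element carrying its name and content type. Here is a minimal sketch, in Python rather than VBA, that lists the parts found in such a string. It assumes you have copied the WordOpenXML output into a string, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the pkg:package / pkg:part wrapper elements.
PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"

def list_flat_opc_parts(flat_opc_xml):
    """Return (part name, content type) for each pkg:part in a Flat OPC string."""
    root = ET.fromstring(flat_opc_xml)
    return [
        (part.get(f"{{{PKG_NS}}}name"), part.get(f"{{{PKG_NS}}}contentType"))
        for part in root.findall(f"{{{PKG_NS}}}part")
    ]
```

The names returned (for example /word/document.xml) are the same part names you would see inside the zipped DOCX.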
So again, the answer is mostly "no", but a little bit of "yes".
You do not state your ultimate aim, but I guess one possibility is to use Microsoft.Office.Interop.Word to manipulate the Word document.
I began writing a docx document to do a project of mine.
Recently, I realized that it would be easier to manage that data if it was in a database.
So, I wanted to import that data into MS Access automatically, to avoid copying and pasting the data manually.
Is there any way to do it? I have only encountered ways of opening the Word application via Access. I also know that docx has an XML structure, so I imagine that if I can open that structure, it would be easy to write a parser in VBA.
There are two basic ways information can be taken out of a Word document and put into an Access database: automating the Word object model using VBA code running in either Word or Access OR extracting the WordOpenXML that makes up the Word document. You indicate you lean towards the second option.
Here, again, there are a number of approaches available:
Use VBA in Word or Access to extract the WordOpenXML of the document open in the Word application user interface.
Use VBA in Access together with non-VBA tools to "crack open" the Zip file and extract the XML.
Use the tools available in the .NET Framework to extract the content of the ZIP file and write it to Access using an OLE DB connection.
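To show what the "crack open the Zip file" option looks like with a non-VBA tool, here is a minimal Python sketch that pulls the plain text of each paragraph out of a closed DOCX. It assumes the main document part sits at the conventional path word/document.xml. The function name is hypothetical, and writing the result to Access is left out.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (the "w:" prefix).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def docx_paragraphs(path):
    """Return the plain text of each w:p paragraph in word/document.xml."""
    with zipfile.ZipFile(path) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t elements.
        text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

This gives you the readable text only. The formatting lives in the surrounding markup, which is why storing the complete XML separately still matters if the document has to be recreated.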
I understand your goal is to be able to recreate the document at a later point for printing, so you want to preserve all the formatting. In addition, you want to be able to read the content from within Access.
I believe this will require a minimum of four fields in the Access table:
1. ID
2. Title
3. Text of song
4. The complete WordOpenXML for re-creating the document
You don't mention (4) in the discussion and problem description, but if you want to store the formatting AND you want to be able to read the content I believe this is necessary. While WordOpenXML is "readable", there's a lot of mark-up in there which doesn't make reading comfortable.
All things being equal, I'd go for either VBA working on the open Word document or the .NET approach, using the Open XML SDK (free download .NET library you can reference in Visual Studio and distribute with solutions).
One important thing to keep in mind is storing the Word Open XML in the database. Unless something has changed in Access, you can't store the ZIP file - you need a "streamable" format. That would be the OOXML OPC flat-file format.
When you read the WordOpenXML from a document using VBA, that's what you get, which is why that would be an option for me. The Open XML SDK doesn't have that option, but there is code available from Eric White's blog for doing this.
When you later want to recreate and print the document it should be enough to stream the WordOpenXML to a file with the .xml extension. Or you could convert it back to a docx zip file (same blog).
I want to insert a chunk of formatted text into a Word document in VBA. Since this will be done server-side, I am advised not to take these chunks from another Word document using Office Interop (link), so I presume it would be easiest to use Open XML like this: <w:p><w:r><w:t>String from WriteToWordDoc method.</w:t></w:r></w:p> Sadly, Application.ActiveDocument.Range.InsertXML fails. So what other quick and dirty alternatives do I have?
For the "bare necessities" see this MSDN article:
https://msdn.microsoft.com/en-us/library/office/dn423225.aspx?f=255&MSPPError=-2147217396
It describes what the minimum requirements are for inserting Open XML into an open Word document. Don't worry that the discussion targets Web add-ins - the principle is the same for any API that inserts WordOpenXML into a Word document at run-time.
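As far as I understand the article, a bare <w:p> fragment is not enough: the content has to arrive as a small Flat OPC package containing at least a relationships part and the main document part. The sketch below builds such a string in Python (the function name is hypothetical). Whether Word accepts this exact package in your setup is not something I have tested, so treat it as a starting point only.

```python
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def wrap_in_flat_opc(body_xml):
    """Wrap WordprocessingML body content (e.g. w:p elements) in a minimal Flat OPC package."""
    return (
        '<pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">'
        # Package relationships: point at the main document part.
        '<pkg:part pkg:name="/_rels/.rels" '
        'pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">'
        '<pkg:xmlData>'
        '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
        '<Relationship Id="rId1" '
        'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" '
        'Target="word/document.xml"/>'
        '</Relationships>'
        '</pkg:xmlData>'
        '</pkg:part>'
        # Main document part holding the caller's body content.
        '<pkg:part pkg:name="/word/document.xml" '
        'pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">'
        '<pkg:xmlData>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body_xml}</w:body></w:document>'
        '</pkg:xmlData>'
        '</pkg:part>'
        '</pkg:package>'
    )
```

Calling wrap_in_flat_opc('<w:p><w:r><w:t>String from WriteToWordDoc method.</w:t></w:r></w:p>') yields a string that could then be handed to InsertXML.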
I may be misunderstanding where "server-side" is involved in your process, so forgive me if the following is not relevant to your situation: Note that using VBA server-side is a bad idea - Word is not designed to run in a server environment. Better would be to use the Open XML SDK both to retrieve and write the information between two Word documents.
I've been asked to create a macro to update a few hundred or so Visio drawings, and keep them updated.
The update involves putting all objects of a certain type on their own layer - simple.
Now, this is easy enough to do, but when a user adds a new object some time in the future it will likely be on a default layer. So I had hoped to be able to include a VBA macro which is triggered by the Save event to re-assign objects to their layers.
The problem here is that I'd need to include this macro in every document since Visio doesn't have an application level VBA project.
Is there any way to introduce a VBA project to ALL Visio documents using code (VBA or otherwise)? Or is there an alternative I might not have considered? Unfortunately, an add-in is not really an option due to available resources.
You have a couple options here:
Force every user to allow programmatic access to the VBA project for their documents, and use VBA automation to add code. This works nicely when you have programmatic access, but this can be difficult to assure.
If you're not using Visio 2013, you can actually save a document as VDX (XML) and replace the data for the VBA project with your own (you'd save out a document as VDX manually, and copy out the chunk of data for the VBA project). As I said, this wouldn't work with Visio 2013, since they seem to have eliminated the VDX format. You can probably get away with something similar with the VSDX XML format for 2013.
You can 'migrate' everyone's documents to a new VST file you provide. This would just involve copying and pasting all the content from a document into the new document that has your code in it. You have to be careful though to make sure all the document- and page-level data comes along, too (meaning DocumentSheet and PageSheet and any Document XML properties that may be important, and attributes like Author, Description, etc...)
Item 1 is the easiest, aside from the pain getting programmatic access to VBA Projects, unless you can have people send you documents to migrate.
Problem:
I embedded a PDF into an Excel workbook.
This PDF is a feedback form which has a send to mail recipient button.
When the PDF is not embedded in Excel, sending works; when it is embedded, sending causes Excel to crash.
My idea is to open the OLEObject, save the embedded PDF temporarily, close the OLEObject, and open the saved PDF so that it runs in an Acrobat instance.
Opening the OLEObject already works by using:
ThisWorkbook.Sheets(SheetName).Shapes(OLE_Name).Select
Selection.Verb Verb:=xlOpen
But I'm struggling with the following steps.
How to do this?
Possible other ways?
WIN7, Office 2010, Acrobat Reader
OK. OLEObjects are notoriously difficult to work with. I have done some searching because it seems hard to believe this is not supported. This was my first attempt to use a Shell command to copy the OLE and paste it to a known directory.
However, you can't copy/paste a PDF embedded object, which means you can't use a Shell command to copy/paste it to a temporary folder/file path. So that won't work.
The consensus seems to be that this is not really possible; all you can do with embedded PDFs is open them. See here and here, which, referencing this PDF, says:
See the "AxAcroPDFLib.AxAcroPDF" section under OLE Automation. Those are the only methods you have available to you from Reader
And also, here:
What you are doing relies on you knowing how to interact with the particular class which may be OK if you only have a couple of types and know what they are but in general, unless (a) the Class supports Automation and (b) you know the commands to issue this can't work.
The class does not really support the sort of automation you desire.
Possible Solution
Use Acrobat and WinAPI
If you have Acrobat installed, then I believe you may be able to do this possibly using WinAPI to get the hWnd of the Acrobat instance, and control it that way. I have perused some documentation, but I do not have Acrobat and so I am not able to test.
The managing director at our company wants me to produce an automated monthly document that saves to a certain place on our system so that he doesn't have to manually input all of the data. I have set up so that the document can save to the correct place in the correct format but my knowledge of VBA is not great.
Tackling this one question at a time, I suppose my question would be: is it possible to create one very long macro that will accomplish many different tasks over several workbooks? For example, we have a report that comes from our ERP (Baan) and shoves all of the data into one cell. Is it possible to create a macro that will apply text-to-columns formatting, then copy data from a cell based on a row reference, and then paste that data into a different workbook? Would it then be able to save the workbook, all from just running one macro, and if so, how long will all of that take once the macro is executed?
Yes, I believe this should be doable, provided the file names and locations stay the same (otherwise you'll have to edit the macro each month). Create different Subs/Functions and call them from one main macro.
The easiest way is probably to do it step by step. Record macros and see whether that already helps you out, and if not, use Google and Stack Overflow for help! :)
It is entirely possible, but in my opinion VBA is not well suited to the task. The editor is atrocious at best, and it is easy to produce highly specific "spaghetti code".
File operations are possible, but are not nice. Error handling is 80s style, with lots of GoTo.
So if you want to build something maintainable, build an external application using Interop or EPPlus (a .NET package for reading and writing Excel documents), or a .NET add-in for Office.