Word VBA - Matching a large selection of text-based keys with data. Embedded resource/text?

I have written a pretty complex VBA plugin for Word that automatically creates a report for me, using XML input, cycling through the X objects within the report to create the output. It is currently embedded into a Word Template file (.DOCM).
I need to insert into the report a static list of text, based on the name of the item within the XML. For example, within my XML I have entries with a name BLAH1, BLAH2, BLAH3. Every time I see BLAH1, I need to match it with the static INSERT1, and BLAH2 match it with INSERT2, etc.
This seems simple enough, but here lies the problem...
It appears there are no hash maps in VBA without requiring external libraries, which I can't really rely on, since I can't install anything on the machines where this will be running. As a result, I can't store this reference data in a hash map as far as I can tell.
I can't seem to concatenate more than about 20 lines of strings together without hitting a limit within VBA, so I can't just store the data as one chunk of text and parse it for what I need: there are about 1500 "lines" in my reference data, which greatly exceeds 20.
I also haven't found a way to embed a text file, or any other type of file, to hold this information within the template and then parse the data.
I really would like to have everything within the single template file, without requiring additional text or other files to be bundled with the document. If there is no other option, I will go that route, but I wanted to see what creative ideas people at Stack Overflow might have first ;-)

Have you considered using Word's Document Variables? They are name/value pairs stored invisibly within the document. Use ActiveDocument.Variables("BLAH1").Value = "INSERT1" to create one, and Debug.Print ActiveDocument.Variables("BLAH1").Value to retrieve a value (you have to use an error handler to detect non-existent names if you go that route). Word can store (at least) hundreds of thousands of these things.
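A minimal sketch of that lookup, assuming the reference data has been loaded into the document's variables once. The names StoreInsert and LookupInsert and the fallback parameter are hypothetical helpers, not part of Word's object model; the error handling covers the missing-name case mentioned above.

```vba
' Hypothetical helpers around Word's document variables.
Sub StoreInsert(ByVal doc As Document, ByVal key As String, ByVal insertText As String)
    ' As described above, assigning .Value creates the variable if needed.
    doc.Variables(key).Value = insertText
End Sub

Function LookupInsert(ByVal doc As Document, ByVal key As String, _
                      Optional ByVal fallback As String = "") As String
    Dim result As String
    result = fallback
    ' Reading a missing variable raises a run-time error; skip over it
    ' so that result keeps the fallback value.
    On Error Resume Next
    result = doc.Variables(key).Value
    On Error GoTo 0
    LookupInsert = result
End Function
```

For example, after StoreInsert ActiveDocument, "BLAH1", "INSERT1", the report code could call LookupInsert(ActiveDocument, "BLAH1") for each XML item and treat the fallback as "no static text for this item".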

Related

Word Automation (VBA): Mail Merge Rich Text Format

I'm trying to do a Word MailMerge via VBA from my Access project. I created a clsWordMerge class so I could declare the Word application WithEvents, and take advantage of Word's MailMerge events, mainly the AfterMerge event.
Everything works fine, and I get the finished Word documents created, except that the source fields containing RTF data end up in the document not as formatted text, but instead the RTF codes and data:
<div><font face="Times New Roman" size=3 color=black>This is my <strong><em>test </em></strong>paragraph.</font></div>
Where I would expect to see:
This is my test paragraph
This happens whether I do a mail merge using a CSV file for my data source or an Access table.
So is there any way to correct this, and show the formatted data? I have access to all of the MailMerge events that Word provides.
Thanks..
No, there's no way to merge in RTF and have it display as Word content. RTF is not Word's native file format - a converter is required to display RTF as Word content.
Mail merge literally displays the data text, as it appears in the data source. There are no "advanced features" that enable selectively formatting the mail merge result.
Also, based on painful experience, relying on MailMergeAfterMerge is not advisable. When it was introduced, I tried it, was enthusiastic... until it started failing. The event is unpredictable and not reliable.
Given your requirements, a fully VBA-driven data transfer from Access to Word is a better investment of time and energy.
It probably can be done in certain circumstances, but I agree with Cindy Meister that the Mail Merge Events have not proven reliable (unless they have been fixed - I haven't actually used them for years). The following description of real and likely problems that I have previously encountered when trying this may help:
Not sure any of it can be done if you are merging to Email.
AFAICR the event you are likely to need (MailMergeBeforeRecordMerge) only fires each time Word processes the Main Document, not each time it processes a record in the data source. So if your Mail Merge Main Document "consumes" more than one Data Source record, e.g. because it uses { NEXT } or { NEXTIF } fields, it may be very difficult to get MailMergeBeforeRecordMerge to do what you need. If I am right about that, that would be enough to put me off making the attempt.
In order to insert your "RTF", you must either
a. Have code that can interpret the "RTF" encoding and do all the right things necessary to insert it in your document, or
b. Have code that saves the "RTF" to an external file, then uses (say) Range.InsertFile to insert it and have Word interpret its contents, or perhaps
c. Use the clipboard to help you do the conversion.
If any of your rich text fields actually contained RTF, (a) would be difficult unless you could find a suitable library to help you. But in fact your sample shows a typical Access rich text field value, which is HTML-like. In fact, I think it is all standard HTML tagging that Word can interpret, but I don't know for sure. That could be much easier to interpret, especially if you only need the plain text (at its simplest, you might be able to throw away the tagging and insert the result).
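The "throw away the tagging" idea could be sketched as below. StripTags is a hypothetical helper; it assumes tags never contain a literal ">" inside an attribute value, and it does not decode entities such as &amp;, so treat it as a starting point only.

```vba
' Hypothetical helper: drops everything between "<" and ">" and keeps the rest.
Function StripTags(ByVal markup As String) As String
    Dim i As Long, ch As String, inTag As Boolean, plain As String
    For i = 1 To Len(markup)
        ch = Mid$(markup, i, 1)
        If ch = "<" Then
            inTag = True                 ' start of a tag: stop copying
        ElseIf ch = ">" And inTag Then
            inTag = False                ' end of the tag: resume copying
        ElseIf Not inTag Then
            plain = plain & ch           ' ordinary text outside any tag
        End If
    Next i
    StripTags = plain
End Function
```

Applied to the sample value from the question, this leaves only the sentence text, with no bold or italic formatting.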
If your rich text is longer than 255 characters (including the markup), Word's Document.MailMerge.DataSource.DataFields("the case-sensitive field name as Word sees it").Value will be truncated. So if you need the whole of the text or more of it, you'll have to get it somewhere else.
The value inserted in the document using a { MERGEFIELD } field is not truncated to 255 characters, so you may be able to get the value from the document. Word MailMerge may impose another limit (can't remember, perhaps 64Kb for an OLE DB connection, perhaps less), or perhaps there is a length limit for the data as a whole.
If you can't get the data from the document, you can get it directly from Access. Probably rather easily if your code is running in Access, but it can be done by using ADODB or perhaps ADO from Word VBA code. Your Mail Merge Data Source will need to retrieve the key fields of the record if you want to do that reliably. During development, if your application is running from Access but you are using VBA code in Word, you will probably also need to make sure that you save your Access database each time you modify your Access VBA code, otherwise Access opens the database exclusively and Word won't be able to retrieve data from it.
If you need to use (b) or (c) to save your HTML to a file then you may need to surround the HTML that you get from Access with <html></html> tags and possibly <body></body> tags to get Word to recognise the HTML. You could use Scripting.FileSystemObject to save the text, or perhaps ADODB.Stream if you are already using ADODB to retrieve Access data.
You should be able to use VBA Range.InsertFile to insert it, as long as you have some placeholder that tells you where to put it. Or you could use an INCLUDETEXT field and ensure that your Event code updates that field. A snag with the INCLUDETEXT approach is that if you merge to a new document, the INCLUDETEXT fields remain in the document so if you update them, they will all end up with the same result if you do not also create a new file for each source record.
i.e. quite a lot to think about!
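A rough sketch of option (b), assuming the code runs in Word VBA and that target is the placeholder range in the merged document. InsertHtmlFragment and the temporary file name are hypothetical, and whether Word interprets every Access rich text value correctly is untested here.

```vba
' Hypothetical helper: saves an HTML fragment to a temp file and lets Word convert it.
Sub InsertHtmlFragment(ByVal target As Range, ByVal fragment As String)
    Dim fso As Object, stream As Object, path As String
    Set fso = CreateObject("Scripting.FileSystemObject")
    path = Environ$("TEMP") & "\merge_fragment.htm"      ' assumed scratch location
    ' Overwrite any old file; write Unicode so non-ASCII text is not lost
    ' (assumption: Word detects the encoding when it opens the file).
    Set stream = fso.CreateTextFile(path, True, True)
    stream.Write "<html><body>" & fragment & "</body></html>"
    stream.Close
    ' Let Word read the file and insert the converted content at the placeholder.
    target.InsertFile FileName:=path, ConfirmConversions:=False
    fso.DeleteFile path
End Sub
```

The fragment itself would come from the merged document or directly from Access, as discussed above, because the DataFields value may be truncated.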

Access to Word Template

I am having issues figuring out the best way to do this:
I have a Word template for an interview pre-night. What I need to do is fill out the Word template with the interviewee and the people who are interviewing them. There will always be 1 interviewee but up to 12 interviewers. The part giving me trouble is that there will not always be 12 interviewers, so the area that the data is moved to needs to be dynamic. Should I create a table or bookmarks in Word and use VBA to move the data, or design the report in Access? Thanks for your help!
I am going to assume that what you are trying to do is complicated enough that "mail merge" won't work for you.
It really depends on whether you need the end result to be a Word document or an Access report. Both easily convert to PDF for document archiving. If you prefer to work with Word, add the key fields into your document with all the formatting necessary and then use VBA to do a search and replace.
Two ways you could go about dealing with the 1-12 interviewers issue:
1. Use VBA to create one long segment containing all interviewers as a single "field".
2. Add 12 sets of key fields and use search/replace (through VBA of course, not manually) to fill in the existing interviewer info and delete the key fields for the non-existing ones.
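The second approach might look like the sketch below. It assumes the template contains literal key fields named [[INTERVIEWER1]] to [[INTERVIEWER12]] (a made-up naming scheme), that the code has a reference to the Word object library, and that ReplaceKey and FillInterviewers are hypothetical helper names.

```vba
' Replaces every occurrence of one key field in the main body of the document.
Sub ReplaceKey(ByVal doc As Document, ByVal key As String, ByVal newText As String)
    With doc.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = key
        .Replacement.Text = newText
        .Forward = True
        .Wrap = wdFindStop
        .MatchCase = True
        .MatchWildcards = False      ' brackets in the key are matched literally
        .Execute Replace:=wdReplaceAll
    End With
End Sub

' Fills up to 12 interviewer keys; keys without a matching name are blanked out.
Sub FillInterviewers(ByVal doc As Document, ByVal names As Variant)
    Dim i As Long, idx As Long, replacement As String
    For i = 1 To 12
        idx = LBound(names) + i - 1
        If idx <= UBound(names) Then
            replacement = CStr(names(idx))
        Else
            replacement = ""         ' no such interviewer: remove the key text
        End If
        ReplaceKey doc, "[[INTERVIEWER" & i & "]]", replacement
    Next i
End Sub
```

A call such as FillInterviewers ActiveDocument, Array("Ann", "Bob") fills the first two keys and empties the other ten. Blanking a key leaves its paragraph or table row in place, so removing the surrounding layout would need extra code.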

Mass Excel hyperlinks

I've looked around and can't seem to find any help for what I am looking to do.
I have a document that I am using to record data related to repairs and such on machines.
All of my entries are done in a numeric order.
I have to scan hard copies in and hyperlink to them from the Excel sheet.
All the files are named by me in a numerical order as well, matching the number in column A.
Is there a way to do this as a formula?
You can use the HYPERLINK formula. The 1st parameter allows you to dynamically build up the URL to point to, so if you say you have the correct numbers already in a column, you can use that to build the URL. Provided the URL for each scanned document can uniquely be derived using only that number, of course.
If the scanned documents are on the web, then you can use e.g.:
=HYPERLINK("https://www.myserver.com/scans/"&A4&".pdf","Scan nr. "&A4)
If they're on your own computer (or on a network drive), then you can use e.g.:
=HYPERLINK("file:///D:/scans/"&A4&".pdf","Scan nr. "&A4)
--- EDIT ---
As Cyril pointed out, you can also just use e.g.:
=HYPERLINK("D:\scans\"&A4&".pdf","Scan nr. "&A4)
which makes it a bit more readable. Also note that Excel really likes to warn you when using these types of links ;)

Automate adding bookmarks to tables and then create an index

I have a program which outputs a collection of tables in a Word document which I eventually want to post as an HTML file with bookmarks and an index. The tables are grouped by "Name:" where there is a 3 row table that contains detailed header information for a section of data, then there is a second table which can span multiple pages which contains the data for that section. There is then a page break so that the next section's header table is on a new page. This can occur for a variable number of sections, numbering in the hundreds. I need to write a script that
1. searches my document for "Name:", which is unique and would not appear anywhere but the header table,
2. grabs the text that follows "Name:" within that table cell (for example "Name: Line 1234"),
3. replaces all the blanks in that text string with an underscore to make it a suitable bookmark name,
4. creates a bookmark with that name,
5. goes back and creates an index at the front of the document,
6. saves the file as HTML.
I have a passing familiarity with VB for Word, and I have used it a bit in Excel, but I am by no means an expert. I would appreciate any advice on functions and objects that I should be using for this script.
Hey MikeV from what I can gather, your problem seems more conceptual, less specific. What I mean is, have you started yet? Or looking at a blank script page?
I'm relatively new to coding, so I get that myself. What I do is make a list of what I need to do (what you have). Then think of the code or pseudo-code that would go with each step. Then you can start to build your script. You don't have to start with step one (as step 2/3 is often the more interesting bit), but let's do that.
Now, you need to search for a text string containing "Name:". I am proficient with VBA in Excel, but haven't done anything for Word. So I'd look it up. Googling "VBA find word in word document" will bring you to this page, which shows you how to approach step one. So steal their code, alter it to fit your needs and move on to step 2. Repeat the process, and that's how you build your algorithm! :)
Just an FYI, typically StackOverflow is for specific questions with an answer that can be confirmed, whereas you asked for help building an algorithm. I'd reserve those questions for your programming professor or a friend who can help.
cheers
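As a starting point for the first four steps in the question, here is a hedged sketch in Word VBA. BookmarkNameFrom and BookmarkNameTables are hypothetical names. The sketch assumes each header table has exactly one cell containing "Name:" and relies on Word's bookmark naming rules (start with a letter; letters, digits and underscores only; at most 40 characters). The index and the HTML export are left out.

```vba
' Turns cell text such as "Name: Line 1234" into a valid bookmark name.
Function BookmarkNameFrom(ByVal cellText As String) As String
    Dim s As String, i As Long, ch As String, cleaned As String
    s = Replace(cellText, Chr$(13), "")      ' paragraph mark
    s = Replace(s, Chr$(7), "")              ' end-of-cell marker
    If InStr(s, "Name:") > 0 Then s = Mid$(s, InStr(s, "Name:") + Len("Name:"))
    s = Trim$(s)
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        ' Blanks and any other disallowed character become underscores.
        If ch Like "[A-Za-z0-9]" Then cleaned = cleaned & ch Else cleaned = cleaned & "_"
    Next i
    If Not (Left$(cleaned, 1) Like "[A-Za-z]") Then cleaned = "B_" & cleaned
    BookmarkNameFrom = Left$(cleaned, 40)
End Function

' Adds one bookmark per table that has a "Name:" cell.
Sub BookmarkNameTables(ByVal doc As Document)
    Dim tbl As Table, c As Cell
    For Each tbl In doc.Tables
        For Each c In tbl.Range.Cells
            If InStr(c.Range.Text, "Name:") > 0 Then
                doc.Bookmarks.Add Name:=BookmarkNameFrom(c.Range.Text), Range:=c.Range
                Exit For
            End If
        Next c
    Next tbl
End Sub
```

If two sections produce the same name, the later one would take over the bookmark, so a real script may need to append a counter.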

how can you parse an excel (.xls) file stored in a varbinary in MS SQL 2005?

problem
how to best parse/access/extract "excel file" data stored as binary data in an SQL 2005 field?
(so all the data can ultimately be stored in other fields of other tables.)
background
basically, our customer is requiring a large volume of verbose data from their users. unfortunately, our customer cannot require any kind of db export from their user. so our customer must supply some sort of UI for their user to enter the data. the UI our customer decided would be acceptable to all of their users was excel as it has a reasonably robust UI. so given all that, our customer needs this data parsed and stored in their db automatically.
we've tried to convince our customer that the users will do this exactly once and then insist on db export! but the customer can not require db export of their users.
our customer is requiring us to parse an excel file
the customer's users are using excel as the "best" user interface to enter all the required data
the users are given blank excel templates that they must fill out
these templates have a fixed number of uniquely named tabs
these templates have a number of fixed areas (cells) that must be completed
these templates also have areas where the user will insert up to thousands of identically formatted rows
when complete, the excel file is submitted from the user by standard html file upload
our customer stores this file raw into their SQL database
given
a standard excel (".xls") file (native format, not comma or tab separated)
file is stored raw in a varbinary(max) SQL 2005 field
excel file data may not necessarily be "uniform" between rows -- i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, different "formats", ...)
requirements
code completely within SQL 2005 (stored procedures, SSIS?)
be able to access values on any worksheet (tab)
be able to access values in any cell (no formula data or dereferencing needed)
cell values must not be assumed to be "uniform" between rows -- i.e., we can't just assume one column is all the same data type (e.g., there may be row headers, column headers, empty cells, formulas, different "formats", ...)
preferences
no filesystem access (no writing temporary .xls files)
retrieve values in defined format (e.g., actual date value instead of a raw number like 39876)
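To illustrate the last preference: a raw date cell value is a day count, and the conversion can be sketched as below. This assumes the workbook uses Excel's default 1900 date system (not the 1904 one) and ignores any time-of-day fraction; ExcelSerialToDate is a hypothetical helper name, and the same arithmetic would apply in whatever language ends up doing the parsing.

```vba
' Converts an Excel 1900-system serial number to a date (time fraction dropped).
Function ExcelSerialToDate(ByVal serial As Double) As Date
    ' Day 0 is 1899-12-30 for all serials from 1 March 1900 onward.
    ExcelSerialToDate = DateAdd("d", Fix(serial), DateSerial(1899, 12, 30))
End Function
```

With this, Format$(ExcelSerialToDate(39876), "yyyy-mm-dd") gives 2009-03-04. Serials below 61 come out one day off because Excel counts a non-existent 29 February 1900.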
My thought is that anything can be done, but there is a price to pay. In this particular case, the price seems to be too high.
I don't have a tested solution for you, but I can share how I would give my first try on a problem like that.
My first approach would be to install Excel on the SQL Server machine, code some assemblies that read the files stored in your rows using the Excel API, and then load them into SQL Server as CLR stored procedures.
As I said, this is just an idea. I don't have details, but I'm sure others here can complement or criticize my idea.
But my real advice is to rethink the whole project. It makes no sense to read tabular data from binary files stored in a cell of a row of a table in a database.
This looks like an "I wouldn't start from here" kind of a question.
The "install Excel on the server and start coding" answer looks like the only route, but it simply has to be worth exploring alternatives first: it's going to be painful, expensive and time-consuming.
I strongly feel that we're looking at a "requirement" that is the answer to the wrong problem.
What business problem is creating this need? What's driving that? Try the Five Whys as a possible way to explore the history.
It sounds like you're trying to store an entire database table inside a spreadsheet and then inside a single table's field. Wouldn't it be simpler to store the data in a database table to begin with and then export it as an XLS when required?
Without opening up an instance of Excel and having Excel resolve worksheet references, I'm not sure it's doable at all.
Could you write the varbinary to a Raw File Destination? And then use an Excel Source as your input to whatever step is next in your precedence constraints.
I haven't tried it, but that's what I would try.
Well, the whole setup seems a bit twisted :-) as others have already pointed out.
If you really cannot change the requirements and the whole setup: why don't you explore components such as Aspose.Cells or Syncfusion XlsIO, native .NET components that allow you to read and interpret native Excel (XLS) files? I'm pretty sure that with either of the two, you should be able to read your binary Excel into a MemoryStream and then feed that into one of those Excel-reading components, and off you go.
So with a bit of .NET development and SQL CLR, I guess this should be doable - not sure if it's the best way to do it, but it should work.