Combine several Excel files into one using VB - vb.net

I have a folder with several excel files that have a date field, i.e. 08-24-2010-123320564.xls. I want to be able to have some VB scripting that will simply take the files that start with todays date and merge them into one file.
08-24-2010-123320564.xls
08-24-2010-123440735.xls
08-24-2010-131450342.xls
into
08-24-2010.xls
Can someone please help?
Thanks
GabrielVA

assuming you just want to append rows in simple spreadsheets, follow this logic:
Psuedocode
use an excel macro
(you could just as well automate excel from vb but why vbscript alone since you need excel anyway?)
have it to a dir listing (dir function)
dim a date_start variable init to "new"
dim a merged_spreadsheet as new doc default to nothing
loop thru result of dir
if date_start <> start of filename
if merged_spreadsheet is not nothing
save it
set it to nothing
store start of date (left mid function) in date_start
if merged_spreadsheet is nothing
make a new one
open the file from the dir command's loop
select all the data
copy it
go to first empty row in merged_spreadsheet
paste it
loop files
if merged_spreadsheet is not nothing
save it
If you're not happy with all those 'nothings', you can set a separate flag to keep track of whether you have a merged_spreadsheet or not. Think about what happens for just one file in a date, no files at all, etc.
Of course you will tear out your hair finding out how to automate those excel functions. The secret to turning 'hard' into 'pretty darn easy' is this:
macro recorder will reveal the automation commands
They are not intuitive. So record a macro. Then do things that you'll need to do in your code. Stop recording and look at the result.
For example:
* load/save files
* select only entered fields
* Select all
* copy / switch files / paste
* create new sheet
* click in various single cells and type (how to examine/set a cell's contents)
In summary-
(1) Know exactly the steps to what you're doing
(2) Use macro recorder to give away the secrets of the excel object model. Steal its secrets.
Really this won't be all that hard if you marry these two concepts cleverly. Since the macro will be vbscript (at least if you use office 97 ;-) you can probably run it from vbs or vb6 if you want. A hop to vb.net shouldn't be that hard either.

Aspose makes it pretty easy to work with excel-files in .NET http://www.aspose.com

Related

Excel 2007 VBA use string variable value in object variable

Good afternoon guys!
I looked around, but I'm not finding anything that addresses my particular issue, so I'll do my best to explain.
Specs:
Excel 2007, VBA, Outlook 2007
Okay, so I've been reworking some scripts that I created to automate our reports in Excel / Avaya CMS. For the past few years I've been the sole person doing our reports, so it wasn't ever necessary to set things up on anyone else's computers. However, things are changing and I'm in the process of updating the scripts / training others to use them. At the moment, when I put my scripts on their computer, I have to go into the VBA code and manually set every reference to their own relative folder paths / outlook folder paths. Painless enough, except when I do any type of changes to the scripts and have to go through the whole process again on each computer.
So this was my solution: Create a config worksheet on the automated reporting workbook, on that config worksheet store the file paths, and in the code simply use variables to reference the config worksheet. This should make it as easy as setting the variables once from the config on a new computer without having to touch any of the lines of code.
Problem: At midnight there is data that is emailed to us from another office. We use Outlook at this office, so I've simply been having it go to the folder specified in the scripts, download the attachments, and then use the downloaded data for the reports. Since everyone sets up their own outlook folders, the paths inside of outlook are different for each user. Since VBA is accessing a worksheet to grab the config information from a cell, it's returning the path for Outlook folders as a string value. However, the outlook folder variable is of Object type, and so it doesn't allow me to use the string variable as it's value, even though the string itself is the actual value the object needs (just not as a string). So is it possible to convert the string value to a value that can be used in the Object variable?
Worksheet Config Cell Value (B5)- The String Value
outNamespace.Folders("Mailbox - Some Guy").Folders("Reports").Folders("ImportantData")
Code:
'Config tab
Dim serverConfig As Worksheet
Set serverConfig = Sheets("CONFIG")
Dim dirOutlookData As String
dirOutlookData = serverConfig.Range("B5").Value
Dim outFolder As Object
Set outFolder = dirOutlookData
Any ideas? Since the string that's returning for dirOutlookData is the value needed for the object variable value, how can I convert the value from the string variable so it can be used in such a way?
Thanks in advance.
Seems too convoluted a solution. Nobody could share their VBA.
In the example provided
outNamespace.Folders("Mailbox - Some Guy").Folders("Reports").Folders("ImportantData"
you could instead use this directly in the VBA
Set mailboxFolder = outNamespace.GetDefaultFolder(olFolderInbox).Parent
You now have the "Mailbox - Some Guy" folder in a generic way not needing to specify "Some Guy".
Set myFolder = mailboxFolder.Folders("Reports").Folders("ImportantData")

Splitting MS Publisher 2010 document into multiple files

I want to split a multi-page MS Publisher 2010 document into a set of separate documents, one per page.
The starting document is from a mail-merge, and I am trying to produce a set of numbered and named tickets as PDFs to send to people for an event (this is for a charity). The mail-merge seems to work fine and I can save the merged document and it looks OK with e.g. a list of fifty people giving me a 50-page document.
Ideally the result would be a set of PDFs.
I have tried to create some simple VBA code to do this, but it is not working consistently. If I try this very simple macro below , I get the correct number of documents, but only perhaps 1 or 2 documents with the correct contents out of every five. Most of the documents are completely empty.
Sub splitter()
Dim i As Integer
Dim Source As Document
Dim Target As Document
Set Source = ActiveDocument
For i = 1 To Source.Pages.Count
Set Target = Documents.Add
Source.Pages(i).Shapes.Range.Copy
Target.Pages(1).Shapes.Paste
Target.SaveAs Filename:="C:\Temp\Ticket_" & i
Target.Close
Set Target = Nothing
Next i
End Sub
I did sometimes get an error that the clipboard is busy, but not always.
Another approach might be to start with the master document and do this looping over the separate documents and fill in the personal details for each person's ticket and directly produce the PDFs. But that seems more complex, and I am not a VB programmer (but been doing C++ etc for 20+ years, so I can program :-) )
A final annoyance is that it seems to keep opening a new Publisher window for each document. It takes a while to then close 50+ copies of publisher, and the laptop starts to crawl...
Please advise how best to get round these issues. I am probably missing something trivial, being a relative VB(A) newbie.
Thanks in advance for any suggestions
Try coding something like this:
Open Publisher application (CreateObject()?)
Open Publisher document (doc.Open(filename))
Store the total amount of pages in a global variable (doc.Pages.Count)
Close document (doc.Close())
Loop the following for each page
Copy the pub file and rename it to name & "page" & X
Open the new pub file
Remove all Pages except page X from the pub file
doc.Save()
doc.Close()
Copying files with VBA is easy, but copying pages in Publisher VBA is quite a hassle, so this should be easier to achieve

Delete top row of excel file using SSIS

I have an excel file that has a header row which is a row that I want to delete. The header row in thsi file are the cells of A1 to W1 merged into one. This causes a problem when I try to read the file because I am expecting column names. Proper column names exist in the second row of the file, which is why I want to delete the first.
To accomplish this I thought I'd be able to use the 'Excel Source' item in SSIS since it supports a SQL option to write a query. What I want to do is something like this:
SELECT * from ExcelFile WHERE Row > 1
My file only has data in columns A thru W.
I don't know what syntax I can use in the query to do this. The query builder that is in the Excel Source item will allow me to do many things with columns but I don't see an option for doing anything with rows. Searching online and using the help didn't get me anywhere.
None of these solutions will work because the Excel driver will be confused by the merged first line. You won't be able to use any driver features such as skip first row to do this. You need to run some script to open the Excel file and delete the row manually.
There is some basic sample script at this site:
http://www.sqlservercentral.com/Forums/Topic1327014-1292-1.aspx
The code below is adapted from the code written by snsingh at that site.
You would obviously want to use connnection manager properties, not hard coded paths
Excel needs to be installed on the SSIS Server for it to work - this is the only way to use Excel automation.
Dim filename As String
Dim appExcel As Object
Dim newBook As Object
Dim oSheet1 As Object
appExcel = CreateObject("Excel.Application")
filename = "C:\test.xls"
appExcel.DisplayAlerts = False
newBook = appExcel.Workbooks.Open(filename)
oSheet1 = newBook.worksheets("Sheet1")
oSheet1.Range("A1").Entirerow.Delete()
newBook.SaveAs(filename, FileFormat:=56)
appExcel.Workbooks.Close()
appExcel.Quit()
You don't need to use a syntax.
Go to control flow..
Pull in a data flow task.
Add a excel file source...add a conection manager
With excel sheet.
Open your connection manager and then check the box which says.
Column names In first row. That's it and add ur destination.

Merging Documents with Open XML

I am looking for mail merge alternatives in my vb.net app. I have used the mail merge feature of word, and find that it is quite buggy when dealing with a large volume of documents. I am looking at alternate methods of generating the merge, and have come across open xml. I think this will probably be the answer I am looking for. I have come to understand that the merge will be entirely code-driven in vb.net. I have started playing around with the following code:
Dim wordprocessingDocument As WordprocessingDocument = wordprocessingDocument.Open("C:\Users\JasonB\Documents\test.docx", True)
'for each simplefield (mergefield)
For Each field In wordprocessingDocument.MainDocumentPart.Document.Body.Descendants(Of SimpleField)()
'get the document instruction values
Dim instruction As String() = field.Instruction.Value.Split(splitChar, StringSplitOptions.RemoveEmptyEntries)
'if mergefield
If instruction(0).ToLower.Equals("mergefield") Then
Dim fieldname As String = instruction(1)
For Each fieldtext In field.Descendants(Of Text)()
fieldtext.Text = "I AM TESTING"
Next
End If
wordprocessingDocument.MainDocumentPart.Document.Save()
wordprocessingDocument.Dispose()
Now this works great and all, but I am realizing that I need to create as many documents as I will have datarows (assuming I use a datatable to handle the data).
One suggestion I found was to loop through each datarow, take my document template, save it to a folder and insert the datarow data. This could mean however that I end up with 12,000 documents in a single folder that need to be joined later and converted to pdf.
Is there another option? The other thing that stood out to me is to create a new word document, and duplicate over the xml from the template, and then replace the values. I dont know however if there is a "simpler" way of doing this, thanks.
If you don't want to save all 12,000 documents to file you should be able to process, convert and email them one at a time using temporary files.
Converting the DOCX to PDF in .NET might be an issue but looks like it's possible using Word Automation (Saving Word DOCX files as PDF).
The bottom line is you don't need to generate all documents before emailing them if you perform the process one document at a time. You can use SmtpClient in VB.NET to email the PDF after it is generated.
In terms of creating the document I have seen reports generated where a simple string replace is used to replace a string such as '%FIRSTNAME%' with the person's name and so on. This isn't necessarily the best approach but can work quite well. This way you can create your template in Word or OpenOffice and then edit it in .NET using OpenXML.

Batch add a macro to word documents?

I have several hundred .doc word documents to which I need to add a macro which runs when the .doc file is opened and creates a header for said document based on the file name. Is there a way to do this as a batch? I have been individually opening each document and going into visual basic --> Project --> This Document then inserting a .txt file which contains the code. Is there a fast way to do this for multiple documents?
As a learning exercise, put this into the "ThisDocument" part of Normal (the Normal.dot template) in the VBE
Open a word document and watch what happens.
I don't think you need to put your code in every single file, I think you should be OK with using the Document_Open event in Normal.dot.
Just make sure it shows up as a reference in your word documents that you open but I don't see why it wouldn't
If you absolutely need it in every file then it can be done but the problem is if you make one small change to the code, you have to go through all this again. The idea with code is to write it once, use it many times.
You can write VBA code that alters the VBA code in other documents, but you need to "Trust access to the VBA project object model" in the Trust Centre options. This could open you up to viral code if you download Word documents with malicious VBA code in them. What you want to do, essentially, is write a VBA virus. There are legitimate reasons for doing this, and also malicious ones, I leave the ethics of the uses of these techniques up to the user. Knowledge itself is not malicious.
Here's the meat, you will need to write your own code to loop through the documents and possibly save them as .docm files.
Sub ReplaceCode()
Set oDoc = ActiveDocument
Set oComponents = oDoc.VBProject.VBComponents
For i = oComponents.Count To 1 Step -1
If oComponents(i).Type = 100 And oComponents(i).Name = "ThisDocument" Then
With oComponents(i).CodeModule
.DeleteLines 1, .CountOfLines
.AddFromFile "C:\ThisDocument.cls"
End With
End If
Next i
End Sub
Also, if you create your code file by exporting from VBA, you will need to remove this from the top of the .cls file:
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Personally, I would drive this from Excel, maybe using a worksheet to hold a list of the files or locations to update, and another sheet for the code to populate with a list of files updated.