I am writing a function to export data to Excel using the Office Interop in VB .NET. I am currently writing the cells directly using the Excel worksheet's Cells() method:
worksheet.Cells(rowIndex, colIndex) = data(rowIndex)(colIndex)
This is taking a long time for large amounts of data. Is there a faster way to write a lot of data to Excel at once? Would doing something with ranges be faster?
You should avoid reading and writing cell by cell if you can. It is much faster to work with arrays, and read or write entire blocks at once. I wrote a post a while back on reading from worksheets using C#; basically, the same code works the other way around (see below), and will run much faster, especially with larger blocks of data.
var sheet = (Worksheet)Application.ActiveSheet;
var range = sheet.get_Range("A1", "B2");
var data = new string[3,3];
data[0, 0] = "A1";
data[0, 1] = "B1";
data[1, 0] = "A2";
data[1, 1] = "B2";
range.Value2 = data;
If you haven't already, make sure to set Application.ScreenUpdating = false before you start to output your data. This will make things go much faster. The set it back to True when you are done outputting your data. Having to redraw the screen on each cell change takes a good bit of time, bypassing this saves that.
As for using ranges, you still will need to target 1 (one) specific cell for a value, so I see no benefit here. I am not aware of doing this any faster than what you are doing in regards to actually outputting the data.
Just to add to Tommy's answer.
You might also want to set the calculation to manual before you start writing.
Application.Calculation =
xlCalculationManual
And set it back to automatic when you're done with your writing. (if there's a chance that the original mode could have been anything other than automatic, you will have to store that value before setting it to manual)
Application.Calculation =
xlCalculationAutomatic
You could also use the CopyFromRecordset method of the Range object.
http://msdn.microsoft.com/de-de/library/microsoft.office.interop.excel.range.copyfromrecordset(office.11).aspx
The fastest way to write and read values from excel ranges is Range.get_Value and Range.set_Value.
The way is as below:
Range filledRange = Worksheet.get_Range("A1:Z678",Missing);
object[,] rngval = (object[,]) filledRange.get_Value (XlRangeValueDataType.xlRangeValueDefault);
Range Destination = Worksheet2.get_Range("A1:Z678",Missing);
destination.set_Value(Missing,rngval);
and yes, no iteration required. Performance is just voila!!
Hope it helps !!
Honestly, the fastest way to write it is with comma delimiters. It's easier to write a line of fields using the Join(",").ToString method instead of trying to iterate through cells. Then save the file as ".csv". Using interop, open the file as a csv which will automatically do the cell update for you upon open.
In case someone else comes along like me looking for a full solution using the method given by #Mathias (which seems to be the fastest for loading into Excel) with #IMil's suggestion on the Array.
Here you go:
'dt (DataTable) is the already populated DataTable
'myExcelWorksheet (Worksheet) is the worksheet we are populating
'rowNum (Integer) is the row we want to start from (usually 1)
Dim misValue As Object = System.Reflection.Missing.Value
Dim arr As Object = DataTableToArray(dt)
'Char 65 is the letter "A"
Dim RangeTopLeft As String = Convert.ToChar(65 + 0).ToString() + rowNum.ToString()
Dim RangeBottomRight As String = Convert.ToChar(65 + dt.Columns.Count - 1).ToString() + (rowNum + dt.Rows.Count - 1).ToString()
Dim Range As String = RangeTopLeft + ":" + RangeBottomRight
myExcelWorksheet.Range(Range, misValue).NumberFormat = "#" 'Include this line to format all cells as type "Text" (optional step)
'Assign to the worksheet
myExcelWorksheet.Range(Range, misValue).Value2 = arr
Then
Function DataTableToArray(dt As DataTable) As Object
Dim arr As Object = Array.CreateInstance(GetType(Object), New Integer() {dt.Rows.Count, dt.Columns.Count})
For nRow As Integer = 0 To dt.Rows.Count - 1
For nCol As Integer = 0 To dt.Columns.Count - 1
arr(nRow, nCol) = dt.Rows(nRow).Item(nCol).ToString()
Next
Next
Return arr
End Function
Limitations include only allowing 26 columns before it would need better code for coming up with the range value letters.
Related
I have a large worksheet (~250K rows, 22 columns, ~40MB plain data) which has to transfer its content to an intranet API. Format does not matter. The problem is: When accessing the data like
Const ROWS = 250000
Const COLS = 22
Dim x As Long, y As Long
Dim myRange As Variant
Dim dummyString As String
Dim sb As New cStringBuilder
myRange = Range(Cells(1, 1), Cells(ROWS, COLS)).Value2
For x = 1 To ROWS
For y = 1 To COLS
dummyString = myRange(x, y) 'Runtime with only this line: 1.8s
sb.Append dummyString 'Runtime with this additional line 163s
Next
Next
I get a wonderful 2D array, but I am not able to collect the data efficiently for HTTP export.
An X/Y loop over the array and access myRange[x, y] has runtimes >1min. I was not able to find an array method which helps to get the imploded/encoded content of the 2D array.
My current workaround is missusing the clipboard (Workaround for Memory Leak when using large string) which works fast, but is a dirty workaround in my eyes AND has one major problem: The values I get are formatted, “.Value” and not “.Value2”, so I have to convert the data on server site again before usage, e.g. unformat currency cells to floats.
What could be another idea to deal with the data array?
My thoughts are that you create two string arrays A and B. A can be of size 1 to ROWS, B can be of size of 1 to COLUMNS. As you loop over each row in your myRange array, fill each element in B with each column's value in that row. After the final column for that row and before you move to the next row, join array B and assign to the row in A. With a loop of this size, only put necessary stuff inside the loop itself. At the end you would join A. You might need to use cstr() when assigning items to B.
Matschek (OP) was able to write the code based on the above, but for anyone else's benefit, the code itself might be something like:
Option Explicit
Private Sub concatenateArrayValues()
Const TOTAL_ROWS As Long = 250000
Const TOTAL_COLUMNS As Long = 22
Dim inputValues As Variant
inputValues = ThisWorkbook.Worksheets("Sheet1").Range("A1").Resize(TOTAL_ROWS, TOTAL_COLUMNS).Value2
' These are static string arrays, as OP's use case involved constants.
Dim outputArray(1 To TOTAL_ROWS) As String ' <- in other words, array A
Dim interimArray(1 To TOTAL_COLUMNS) As String ' <- in other words, array B
Dim rowIndex As Long
Dim columnIndex As Long
' We use constants below when specifying the loop's limits instead of Lbound() and Ubound()
' as OP's use case involved constants.
' If we were using dynamic arrays, we could call Ubound(inputValues,2) once outside of the loop
' And assign the result to a Long type variable
' To avoid calling Ubound() 250k times within the loop itself.
For rowIndex = 1 To TOTAL_ROWS
For columnIndex = 1 To TOTAL_COLUMNS
interimArray(columnIndex) = inputValues(rowIndex, columnIndex)
Next columnIndex
outputArray(rowIndex) = VBA.Strings.Join(interimArray, ",")
Next rowIndex
Dim concatenatedOutput As String
concatenatedOutput = VBA.Strings.Join(outputArray, vbNewLine)
Debug.Print concatenatedOutput
' My current machine isn't particularly great
' but the code above ran and concatenated values in range A1:V250000
' (with each cell containing a random 3-character string) in under 4 seconds.
End Sub
I have worked in Python earlier where it is really smooth to have a dictionary of lists (i.e. one key corresponds to a list of stuff). I am struggling to achieve the same in vba. Say I have the following data in an excel sheet:
Flanged_connections 6
Flanged_connections 8
Flanged_connections 10
Instrument Pressure
Instrument Temperature
Instrument Bridle
Instrument Others
Piping 1
Piping 2
Piping 3
Now I want to read the data and store it in a dictionary where the keys are Flanged_connections, Instrument and Piping and the values are the corresponding ones in the second column. I want the data to look like this:
'key' 'values':
'Flanged_connections' '[6 8 10]'
'Instrument' '["Pressure" "Temperature" "Bridle" "Others"]'
'Piping' '[1 2 3]'
and then being able to get the list by doing dict.Item("Piping") with the list [1 2 3] as the result. So I started thinking doing something like:
For Each row In inputRange.Rows
If Not equipmentDictionary.Exists(row.Cells(equipmentCol).Text) Then
equipmentDictionary.Add row.Cells(equipmentCol).Text, <INSERT NEW LIST>
Else
equipmentDictionary.Add row.Cells(equipmentCol).Text, <ADD TO EXISTING LIST>
End If
Next
This seems a bit tedious to do. Is there a better approach to this? I tried searching for using arrays in vba and it seems a bit different than java, c++ and python, with stuft like redim preserve and the likes. Is this the only way to work with arrays in vba?
My solution:
Based on #varocarbas' comment I have created a dictionary of collections. This is the easiest way for my mind to comprehend what's going on, though it might not be the most efficient. The other solutions would probably work as well (not tested by me). This is my suggested solution and it provides the correct output:
'/--------------------------------------\'
'| Sets up the dictionary for equipment |'
'\--------------------------------------/'
inputRowMin = 1
inputRowMax = 173
inputColMin = 1
inputColMax = 2
equipmentCol = 1
dimensionCol = 2
Set equipmentDictionary = CreateObject("Scripting.Dictionary")
Set inputSheet = Application.Sheets(inputSheetName)
Set inputRange = Range(Cells(inputRowMin, inputColMin), Cells(inputRowMax, inputColMax))
Set equipmentCollection = New Collection
For i = 1 To inputRange.Height
thisEquipment = inputRange(i, equipmentCol).Text
nextEquipment = inputRange(i + 1, equipmentCol).Text
thisDimension = inputRange(i, dimensionCol).Text
'The Strings are equal - add thisEquipment to collection and continue
If (StrComp(thisEquipment, nextEquipment, vbTextCompare) = 0) Then
equipmentCollection.Add thisDimension
'The Strings are not equal - add thisEquipment to collection and the collection to the dictionary
Else
equipmentCollection.Add thisDimension
equipmentDictionary.Add thisEquipment, equipmentCollection
Set equipmentCollection = New Collection
End If
Next
'Check input
Dim tmpCollection As Collection
For Each key In equipmentDictionary.Keys
Debug.Print "--------------" & key & "---------------"
Set tmpCollection = equipmentDictionary.Item(key)
For i = 1 To tmpCollection.Count
Debug.Print tmpCollection.Item(i)
Next
Next
Note that this solution assumes that all the equipment are sorted!
Arrays in VBA are more or less like everywhere else with various peculiarities:
Redimensioning an array is possible (although not required).
Most of the array properties (e.g., Sheets array in a Workbook) are 1-based. Although, as rightly pointed out by #TimWilliams, the user-defined arrays are actually 0-based. The array below defines a string array with a length of 11 (10 indicates the upper position).
Other than that and the peculiarities regarding notations, you shouldn't find any problem to deal with VBA arrays.
Dim stringArray(10) As String
stringArray(1) = "first val"
stringArray(2) = "second val"
'etc.
Regarding what you are requesting, you can create a dictionary in VBA and include a list on it (or the VBA equivalent: Collection), here you have a sample code:
Set dict = CreateObject("Scripting.Dictionary")
Set coll = New Collection
coll.Add ("coll1")
coll.Add ("coll2")
coll.Add ("coll3")
If Not dict.Exists("dict1") Then
dict.Add "dict1", coll
End If
Dim curVal As String: curVal = dict("dict1")(3) '-> "coll3"
Set dict = Nothing
You can have dictionaries within dictionaries. No need to use arrays or collections unless you have a specific need to.
Sub FillNestedDictionairies()
Dim dcParent As Scripting.Dictionary
Dim dcChild As Scripting.Dictionary
Dim rCell As Range
Dim vaSplit As Variant
Dim vParentKey As Variant, vChildKey As Variant
Set dcParent = New Scripting.Dictionary
'Don't use currentregion if you have adjacent data
For Each rCell In Sheet2.Range("A1").CurrentRegion.Cells
'assume the text is separated by a space
vaSplit = Split(rCell.Value, Space(1))
'If it's already there, set the child to what's there
If dcParent.Exists(vaSplit(0)) Then
Set dcChild = dcParent.Item(vaSplit(0))
Else 'create a new child
Set dcChild = New Scripting.Dictionary
dcParent.Add vaSplit(0), dcChild
End If
'Assumes unique post-space data - text for Exists if that's not the case
dcChild.Add CStr(vaSplit(1)), vaSplit(1)
Next rCell
'Output to prove it works
For Each vParentKey In dcParent.Keys
For Each vChildKey In dcParent.Item(vParentKey).Keys
Debug.Print vParentKey, vChildKey
Next vChildKey
Next vParentKey
End Sub
I am not that familiar with C++ and Python (been a long time) so I can't really speak to the differences with VBA, but I can say that working with Arrays in VBA is not especially complicated.
In my own humble opinion, the best way to work with dynamic arrays in VBA is to Dimension it to a large number, and shrink it when you are done adding elements to it. Indeed, Redim Preserve, where you redimension the array while saving the values, has a HUGE performance cost. You should NEVER use Redim Preserve inside a loop, the execution would be painfully slow
Adapt the following piece of code, given as an example:
Sub CreateArrays()
Dim wS As Worksheet
Set wS = ActiveSheet
Dim Flanged_connections()
ReDim Flanged_connections(WorksheetFunction.CountIf(wS.Columns(1), _
"Flanged_connections"))
For i = 1 To wS.Cells(1, 1).CurrentRegion.Rows.Count Step 1
If UCase(wS.Cells(i, 1).Value) = "FLANGED_CONNECTIONS" Then ' UCASE = Capitalize everything
Flanged_connections(c1) = wS.Cells(i, 2).Value
End If
Next i
End Sub
I am trying to import a spreadsheet that has a question on each row along with 4 possible answers. I can successfully read the cell values, but the correct answer is indicated by a fill pattern (50% Gray). I am using the code below to loop through the worksheet and pick out the correct answers. However, the value of Pattern seems to be the same for all columns, even though the pattern is plainly visible on the worksheet. Am I looking in the wrong place?
The worksheet is an .xls file. I am using Excel 2010 and VS 2010.
Dim dt As New System.Data.DataTable
Dim wks As Worksheet = wkb.Worksheets(1)
Dim ur As Range = wks.UsedRange
' Load all cells into an array.
Dim SheetData(,) As Object = ur.Value(XlRangeValueDataType.xlRangeValueDefault)
' Loop through all cells.
For j As Integer = 1 To SheetData.GetUpperBound(0)
For k As Integer = 1 To (SheetData.GetUpperBound(1) - 1)
'Get the pattern for the cells in columns 7 - 10
If (k > 6) And (k < 11) Then
Dim r As Range = wks.Cells(j, k)
Dim s As Style = r.Style
If s.Interior.Pattern = XlPattern.xlPatternGray50 Then
'Convert column index to "A" - "D"
Dim key As Char = ChrW(k + 58)
'Do something with key
End If
End If
Next
Next
I looked in MSDN but they give little or no explanation of how styles are stored in the object model. The few examples that I have seen show using the Style.Interior.Pattern to set the value after selecting the cell. Do I need to select the cell to read the pattern?
Any help would be appreciated.
Interior.Pattern is accessible from a Range object. The Range interface/object contains all the style and value information for the range.
The Style property of the Range object refers to Styles that are definied at workbook level ("shared" styles).
From MSDN:
The Style object contains all style attributes (font, number format, alignment, and so on) as properties. There are several built-in styles, including Normal, Currency, and
Percent. Using the Style object is a fast and efficient way to change
several cell-formatting properties on multiple cells at the same time.
In your case (where the styles seems to be definied at cell level and not by using a "shared" style), you just need to replace:
Dim r As Range = wks.Cells(j, k)
Dim s As Style = r.Style
If s.Interior.Pattern = XlPattern.xlPatternGray50 Then
With:
Dim r As Range = wks.Cells(j, k)
'Dim s As Style = r.Style 'no need
If r.Interior.Pattern = XlPattern.xlPatternGray50 Then
before I start I want to point out that I tagged this question as VBA because I can't actually make a new tag for Winwrap and I've been told that Winwrap is pretty much the same as VBA.
I'm working on SPSS V19.0 and I'm trying to make a code that will help me identify and assign value labels to all values that don't have a label in the specified variable (or all variables).
The pseudo code below is for the version where it's a single variable (perhaps inputted by a text box or maybe sent via a custom dialogue in the SPSS Stats program (call the .sbs file from the syntax giving it the variable name).
Here is the Pseudo Code:
Sub Main(variable As String)
On Error GoTo bye
'Variable Declaration:
Dim i As Integer, intCount As Integer
Dim strValName As String, strVar As String, strCom As String
Dim varLabels As Variant 'This should be an array of all the value labels in the selected record
Dim objSpssApp As 'No idea what to put here, but I want to select the spss main window.
'Original Idea was to use two loops
'The first loop would fill an array with the value lables and use the index as the value and
'The second loop would check to see which values already had labels and then
'Would ask the user for a value label to apply to each value that didn't.
'loop 1
'For i = 0 To -1
'current = GetObject(variable.valuelist(i)) 'would use this to get the value
'Set varLabels(i) = current
'Next
'Loop for each number in the Value list.
strValName = InputBox("Please specify the variable.")
'Loop for each number in the Value list.
For i = 0 To varLabels-1
If IsEmpty (varLabels(i)) Then
'Find value and ask for the current value label
strVar = InputBox("Please insert Label for value "; varLabels(i);" :","Insert Value Label")
'Apply the response to the required number
strCom = "ADD VALUE LABELS " & strVar & Chr$(39) & intCount & Chr$(39) & Chr$(39) & strValName & Chr$(39) &" ."
'Then the piece of code to execute the Syntax
objSpssApp.ExecuteCommands(strCom, False)
End If
'intCount = intCount + 1 'increase the count so that it shows the correct number
'it's out of the loop so that even filled value labels are counted
'Perhaps this method would be better?
Next
Bye:
End Sub
This is in no way functioning code, it's just basically pseudo code for the process that I want to achieve I'm just looking for some help on it, if you could that would be magic.
Many thanks in advance
Mav
Winwrap and VBA are almost identical with differences that you can find in this post:
http://www.winwrap.com/web/basic/reference/?p=doc_tn0143_technote.htm
I haven't used winwrap, but I'll try to answer with my knowledge from VBA.
Dim varLabels As Variant
You can make an array out of this by saying for example
dim varLabels() as variant 'Dynamically declared array
dim varLabels(10) as variant 'Statically declared array
dim varLabels(1 to 10) as variant 'Array starting from 1 - which I mostly use
dim varLabels(1 to 10, 1 to 3) 'Multidimensional array
Dim objSpssApp As ?
"In theory", you can leave this as a variant type or even do
Dim objSpssApp
Without further declaration, which is basically the same - and it will work because a variant can be anything and will not generate an error. It is good custom though to declare you objects according to an explicit datatype in because the variant type is expensive in terms of memory. You should actually find out about the objects class name, but I cannot give you this. I guess that you should do something like:
set objSpssApp = new <Spss Window>
set objSpssApp = nothing 'In the end to release the object
Code:
'loop 1
For i = 0 To -1
current = GetObject(variable.valuelist(i)) 'would use this to get the value
Set varLabels(i) = current
Next
I don't exactly know why you want to count from 0 to -1 but perhaps it is irrelevant.
To fill an array, you can just do: varLabels(i) = i
The SET statement is used to set objects and you don't need to create an object to create an array. Also note that you did not declare half of the variables used here.
Code:
strVar = InputBox("Please insert Label for value "; varLabels(i);" :","Insert Value Label")
Note that the concatenation operator syntax is &.
This appears to be the same in WinWrap:
http://www.winwrap.com/web/basic/language/?p=doc_operators_oper.htm
But you know this, since you use it in your code.
Code:
'intCount = intCount + 1 'increase the count so that it shows the correct number
'it's out of the loop so that even filled value labels are counted
'Perhaps this method would be better?
I'm not sure if I understand this question, but in theory all loops are valid in any situation, it depends on your preference. For ... Next, Do ... Loop, While ... Wend, in the end they all do basically the same thing. intCount = intCount + 1 seems valid when using it in a loop.
Using Next (for ... next)
When using a counter, always use Next iCounter because it increments the counter.
I hope this reply may be of some use to you!
I have the following Data in Excel.
CHM0123456 SRM0123:01
CHM0123456 SRM0123:02
CHM0123456 SRM0256:12
CHM0123456 SRM0123:03
CHM0123457 SRM0789:01
CHM0123457 SRM0789:02
CHM0123457 SRM0789:03
CHM0123457 SRM0789:04
What I need to do is pull out all the relevent SRM numbers that relate to a single CHM ref. now I have a formular that will do some thing like this
=INDEX($C$2:$C$6, SMALL(IF($B$8=$B$2:$B$6, ROW($B$2:$B$6)-MIN(ROW($B$2:$B$6))+1, ""), ROW(A1)))
however this is a bit untidy and I really want to produce this same using short vb script, do i jsut have to right a loop that will run though and check each row in turn.
For x = 1 to 6555
if Ax = Chm123456
string = string + Bx
else
next
which should give me a final string of
SRM0123:01,SRM123:02,SRM0256:12,SRM0123:03
to use with how i want.
Or is ther a neater way to do this ?
Cheers
Aaron
my current code
For x = 2 To 6555
If Cells(x, 1).Value = "CHM0123456" Then
outstring = outstring + vbCr + Cells(x, 2).Value
End If
Next
MsgBox (outstring)
End Function
I'm not sure what your definition of 'neat' is, but here is a VBA function that I consider very neat and also flexible and it's lightning fast (10k+ entires with no lag). You pass it the CHM you want to look for, then the range to look in. You can pass a third optional paramater to set how each entry is seperated. So in your case you could write (assuming your list is :
=ListUnique(B2, B2:B6555)
You can also use Char(10) as the third parameter to seperat by line breaks, etc.
Function ListUnique(ByVal search_text As String, _
ByVal cell_range As range, _
Optional seperator As String = ", ") As String
Application.ScreenUpdating = False
Dim result As String
Dim i as Long
Dim cell As range
Dim keys As Variant
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
On Error Resume Next
For Each cell In cell_range
If cell.Value = search_text Then
dict.Add cell.Offset(, 1).Value, 1
End If
Next
keys = dict.keys
For i = 0 To UBound(keys)
result = result & (seperator & keys(i))
Next
If Len(result) <> 0 Then
result = Right$(result, (Len(result) - Len(seperator)))
End If
ListUnique = result
Application.ScreenUpdating = True
End Function
How it works: It simple loops through your range looking for the search_string you give it. If it finds it, it adds it to a dictionary object (which will eliminate all dupes). You dump the results in an array then create a string out of them. Technically you can just pass it "B:B" as the search array if you aren't sure where the end of the column is and this function will still work just fine (1/5th of a second for scanning every cell in column B with 1000 unique hits returned).
Another solution would be to do an advancedfilter for Chm123456 and then you could copy those to another range. If you get them in a string array you can use the built-in excel function Join(saString, ",") (only works with string arrays).
Not actual code for you but it points you in a possible direction that can be helpful.
OK, this might be pretty fast for a ton of data. Grabbing the data for each cell takes a ton of time, it is better to grab it all at once. The the unique to paste and then grab the data using
vData=rUnique
where vData is a variant and rUnique is the is the copied cells. This might actually be faster than grabbing each data point point by point (excel internally can copy and paste extremely fast). Another option would be to grab the unique data without having the copy and past happen, here's how:
dim i as long
dim runique as range, reach as range
dim sData as string
dim vdata as variant
set runique=advancedfilter(...) 'Filter in place
set runique=runique.specialcells(xlCellTypeVisible)
for each reach in runique.areas
vdata=reach
for i=lbound(vdata) to ubound(vdata)
sdata=sdata & vdata(i,1)
next l
next reach
Personally, I would prefer the internal copy paste then you could go through each sheet and then grab the data at the very end (this would be pretty fast, faster than looping through each cell). So going through each sheet.
dim wks as worksheet
for each wks in Activeworkbook.Worksheets
if wks.name <> "CopiedToWorksheet" then
advancedfilter(...) 'Copy to bottom of list, so you'll need code for that
end if
next wks
vdata=activeworkbook.sheets("CopiedToWorksheet").usedrange
sData=vdata(1,1)
for i=lbound(vdata) + 1 to ubound(vdata)
sData=sData & ","
next i
The above code should be blazing fast. I don't think you can use Join on a variant, but you could always attempt it, that would make it even faster. You could also try application.worksheetfunctions.contat (or whatever the contatenate function is) to combine the results and then just grab the final result.
On Error Resume Next
wks.ShowAllData
On Error GoTo 0
wks.UsedRange.Rows.Hidden = False
wks.UsedRange.Columns.Hidden = False
rFilterLocation.ClearContents