Load csv file into a VBA array rather than Excel Sheet - vba

I can currently get csv file data into Excel VBA by loading it onto a worksheet with the code below and then working with the resulting table. This is surely not the best way, as I am only interested in some of the data and I delete the sheet after using it:
Sub CSV_Import()
Dim ws As Worksheet, strFile As String
Set ws = ActiveSheet 'set to current worksheet name
strFile = Application.GetOpenFilename("Text Files (*.csv),*.csv", ,"Please select text file...")
With ws.QueryTables.Add(Connection:="TEXT;" & strFile, Destination:=ws.Range("A1"))
.TextFileParseType = xlDelimited
.TextFileCommaDelimiter = True
.Refresh
End With
End Sub
Is it possible to simply load the csv into a two dimensional variant array in VBA rather than going through the use of an excel worksheet?

Okay, looks like you need two things: stream the data from the file, and populate a 2-D array.
I have a 'Join2d' and a 'Split2d' function lying around (I recall posting them in another reply on StackOverflow a while ago). Do look at the comments in the code, there are things you might need to know about efficient string-handling if you're handling large files.
However, it's not a complicated function to use: just paste the code if you're in a hurry.
Streaming the file is simple BUT we're making assumptions about the file format: are the lines in the file delimited by Carriage-Return characters or the Carriage-Return-and-Linefeed character pair? I'm assuming 'CR' rather than CRLF, but you need to check that.
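If you are not sure which line ending your file uses, something like the following could pick the delimiter for you. This is only a sketch: GuessRowDelimiter is a hypothetical helper, not part of the code in this answer, and it assumes you have already read the file's text into a string.

```vba
' Hypothetical helper (an assumption, not part of the original answer):
' guess the row delimiter from text that has already been read from the file.
Public Function GuessRowDelimiter(ByRef strText As String) As String
    If InStr(1, strText, vbCrLf, vbBinaryCompare) > 0 Then
        ' Windows-style line endings: CR followed by LF
        GuessRowDelimiter = vbCrLf
    ElseIf InStr(1, strText, vbLf, vbBinaryCompare) > 0 Then
        ' Unix-style line endings: LF only
        GuessRowDelimiter = vbLf
    Else
        ' Fall back to a bare CR, the default assumed in this answer
        GuessRowDelimiter = vbCr
    End If
End Function
```

The check for vbCrLf has to come first, because a CRLF file also contains bare LF and CR characters. The result would be passed on as the RowDelimiter argument.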
Another assumption about the format is that numeric data will appear as-is, and string or character data will be encapsulated in quote marks. This should be true, but often isn't... And stripping out the quote marks adds a lot of processing - lots of allocating and deallocating strings - which you really don't want to be doing in a large array. I've short-cut the obvious cell-by-cell find-and-replace, but it's still an issue on large files.
If your file has commas embedded in the string values, this code won't work: and don't try to code up a parser that picks out the encapsulated text and skips these embedded commas when splitting-up the rows of data into individual fields, because this intensive string-handling can't be optimised into a fast and reliable csv reader by VBA.
Anyway: here's the source code: watch out for line-breaks inserted by StackOverflow's textbox control:
Running the code:
Note that you'll need a reference to the Microsoft Scripting Runtime (system32\scrrun.dll)
Private Sub test()
Dim arrX As Variant
arrX = ArrayFromCSVfile("MyFile.csv")
End Sub
Streaming a csv file.
Note that I'm assuming your file is in the temp folder:
C:\Documents and Settings\[$USERNAME]\Local Settings\Temp
You'll need to use filesystem commands to copy the file into a local folder: it's always quicker than working across the network.
Public Function ArrayFromCSVfile( _
strName As String, _
Optional RowDelimiter As String = vbCr, _
Optional FieldDelimiter = ",", _
Optional RemoveQuotes As Boolean = True _
) As Variant
' Load a file created by FileToArray into a 2-dimensional array
' The file name is specified by strName, and it is expected to exist
' in the user's temporary folder. This is a deliberate restriction:
' it's always faster to copy remote files to a local drive than to
' edit them across the network
' RemoveQuotes=TRUE strips out the double-quote marks (Char 34) that
' encapsulate strings in most csv files.
On Error Resume Next
Dim objFSO As Scripting.FileSystemObject
Dim arrData As Variant
Dim strFile As String
Dim strTemp As String
Set objFSO = New Scripting.FileSystemObject
strTemp = objFSO.GetSpecialFolder(Scripting.TemporaryFolder).ShortPath
strFile = objFSO.BuildPath(strTemp, strName)
If Not objFSO.FileExists(strFile) Then ' raise an error?
Exit Function
End If
Application.StatusBar = "Reading the file... (" & strName & ")"
If Not RemoveQuotes Then
arrData = Split2d(objFSO.OpenTextFile(strFile, ForReading).ReadAll, RowDelimiter, FieldDelimiter)
Application.StatusBar = "Reading the file... Done"
Else
' we have to do some allocation here...
strTemp = objFSO.OpenTextFile(strFile, ForReading).ReadAll
Application.StatusBar = "Reading the file... Done"
Application.StatusBar = "Parsing the file..."
strTemp = Replace$(strTemp, Chr(34) & RowDelimiter, RowDelimiter)
strTemp = Replace$(strTemp, RowDelimiter & Chr(34), RowDelimiter)
strTemp = Replace$(strTemp, Chr(34) & FieldDelimiter, FieldDelimiter)
strTemp = Replace$(strTemp, FieldDelimiter & Chr(34), FieldDelimiter)
If Right$(strTemp, 1) = Chr(34) Then
strTemp = Left$(strTemp, Len(strTemp) - 1)
End If
If Left$(strTemp, 1) = Chr(34) Then
strTemp = Right$(strTemp, Len(strTemp) - 1)
End If
Application.StatusBar = "Parsing the file... Done"
arrData = Split2d(strTemp, RowDelimiter, FieldDelimiter)
strTemp = ""
End If
Application.StatusBar = False
Set objFSO = Nothing
ArrayFromCSVfile = arrData
Erase arrData
End Function
Split2d
Creates a 2-dimensional VBA array from a string:
Public Function Split2d(ByRef strInput As String, _
Optional RowDelimiter As String = vbCr, _
Optional FieldDelimiter = vbTab, _
Optional CoerceLowerBound As Long = 0 _
) As Variant
' Split up a string into a 2-dimensional array.
' Works like VBA.Strings.Split, for a 2-dimensional array.
' Check your lower bounds on return: never assume that any array in
' VBA is zero-based, even if you've set Option Base 0
' If in doubt, coerce the lower bounds to 0 or 1 by setting
' CoerceLowerBound
' Note that the default delimiters are those inserted into the
' string returned by ADODB.Recordset.GetString
On Error Resume Next
' Coding note: we're not doing any string-handling in VBA.Strings -
' allocating, deallocating and (especially!) concatenating are SLOW.
' We're using the VBA Join & Split functions ONLY. The VBA Join,
' Split, & Replace functions are linked directly to fast (by VBA
' standards) functions in the native Windows code. Feel free to
' optimise further by declaring and using the Kernel string functions
' if you want to.
' ** THIS CODE IS IN THE PUBLIC DOMAIN **
' Nigel Heffernan Excellerando.Blogspot.com
Dim i As Long
Dim j As Long
Dim i_n As Long
Dim j_n As Long
Dim i_lBound As Long
Dim i_uBound As Long
Dim j_lBound As Long
Dim j_uBound As Long
Dim arrTemp1 As Variant
Dim arrTemp2 As Variant
arrTemp1 = Split(strInput, RowDelimiter)
i_lBound = LBound(arrTemp1)
i_uBound = UBound(arrTemp1)
If VBA.LenB(arrTemp1(i_uBound)) <= 0 Then
' clip out empty last row: a common artifact in data
'loaded from files with a terminating row delimiter
i_uBound = i_uBound - 1
End If
i = i_lBound
arrTemp2 = Split(arrTemp1(i), FieldDelimiter)
j_lBound = LBound(arrTemp2)
j_uBound = UBound(arrTemp2)
If VBA.LenB(arrTemp2(j_uBound)) <= 0 Then
' ! potential error: first row with an empty last field...
j_uBound = j_uBound - 1
End If
i_n = CoerceLowerBound - i_lBound
j_n = CoerceLowerBound - j_lBound
ReDim arrData(i_lBound + i_n To i_uBound + i_n, j_lBound + j_n To j_uBound + j_n)
' As we've got the first row already... populate it
' here, and start the main loop from lbound+1
For j = j_lBound To j_uBound
arrData(i_lBound + i_n, j + j_n) = arrTemp2(j)
Next j
For i = i_lBound + 1 To i_uBound Step 1
arrTemp2 = Split(arrTemp1(i), FieldDelimiter)
For j = j_lBound To j_uBound Step 1
arrData(i + i_n, j + j_n) = arrTemp2(j)
Next j
Erase arrTemp2
Next i
Erase arrTemp1
Application.StatusBar = False
Split2d = arrData
End Function
Join2D
Turns a 2-dimensional VBA array to a string:
Public Function Join2d(ByRef InputArray As Variant, _
Optional RowDelimiter As String = vbCr, _
Optional FieldDelimiter = vbTab, _
Optional SkipBlankRows As Boolean = False _
) As String
' Join up a 2-dimensional array into a string. Works like the standard
' VBA.Strings.Join, for a 2-dimensional array.
' Note that the default delimiters are those inserted into the string
' returned by ADODB.Recordset.GetString
On Error Resume Next
' Coding note: we're not doing any string-handling in VBA.Strings -
' allocating, deallocating and (especially!) concatenating are SLOW.
' We're using the VBA Join & Split functions ONLY. The VBA Join,
' Split, & Replace functions are linked directly to fast (by VBA
' standards) functions in the native Windows code. Feel free to
' optimise further by declaring and using the Kernel string functions
' if you want to.
' ** THIS CODE IS IN THE PUBLIC DOMAIN **
' Nigel Heffernan Excellerando.Blogspot.com
Dim i As Long
Dim j As Long
Dim i_lBound As Long
Dim i_uBound As Long
Dim j_lBound As Long
Dim j_uBound As Long
Dim arrTemp1() As String
Dim arrTemp2() As String
Dim strBlankRow As String
i_lBound = LBound(InputArray, 1)
i_uBound = UBound(InputArray, 1)
j_lBound = LBound(InputArray, 2)
j_uBound = UBound(InputArray, 2)
ReDim arrTemp1(i_lBound To i_uBound)
ReDim arrTemp2(j_lBound To j_uBound)
For i = i_lBound To i_uBound
For j = j_lBound To j_uBound
arrTemp2(j) = InputArray(i, j)
Next j
arrTemp1(i) = Join(arrTemp2, FieldDelimiter)
Next i
If SkipBlankRows Then
' A blank row holds nothing but the delimiters between its fields
For j = j_lBound To j_uBound - 1
strBlankRow = strBlankRow & FieldDelimiter
Next j
' Compact the row array in place, dropping the blank rows
j = i_lBound - 1
For i = i_lBound To i_uBound
If arrTemp1(i) <> strBlankRow Then
j = j + 1
arrTemp1(j) = arrTemp1(i)
End If
Next i
If j >= i_lBound Then
ReDim Preserve arrTemp1(i_lBound To j)
Join2d = Join(arrTemp1, RowDelimiter)
End If
Else
Join2d = Join(arrTemp1, RowDelimiter)
End If
Erase arrTemp1
End Function
Share and enjoy.

Yes, read it as a text file.
See this example:
Option Explicit
Sub Sample()
Dim MyData As String, strData() As String
Open "C:\MyFile.CSV" For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
strData() = Split(MyData, vbCrLf)
End Sub
FOLLOWUP
As I mentioned in the comments, AFAIK there is no direct way of filling a 2D array from a csv. You will have to use the code that I gave above, then split the data line by line, and finally fill a 2D array, which can be cumbersome. Filling a single column is easy, but if you specifically want, say, the data from Row 5 to Col 7, it becomes cumbersome because you have to check that the data has enough columns and rows. Here is a basic example that gets Col B into a 2D array.
NOTE: I have not done any error handling. I am sure you can take care of that.
Let's say our CSV file looks like this.
When you run this code
Option Explicit
Const Delim As String = ","
Sub Sample()
Dim MyData As String, strData() As String, TmpAr() As String
Dim TwoDArray() As String
Dim i As Long, n As Long
Open "C:\Users\Siddharth Rout\Desktop\Sample.CSV" For Binary As #1
MyData = Space$(LOF(1))
Get #1, , MyData
Close #1
strData() = Split(MyData, vbCrLf)
n = 0
For i = LBound(strData) To UBound(strData)
If Len(Trim(strData(i))) <> 0 Then
TmpAr = Split(strData(i), Delim)
n = n + 1
ReDim Preserve TwoDArray(1, 1 To n)
'~~> TmpAr(1) : 1 for Col B, 0 would be A
TwoDArray(1, n) = TmpAr(1)
End If
Next i
For i = 1 To n
Debug.Print TwoDArray(1, i)
Next i
End Sub
You will get the output as shown below
BTW, I am curious that since you are doing this in Excel, why not use inbuilt Workbooks.Open or QueryTables method and then read the range into a 2D array? That would be much simpler...
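For reference, the worksheet route mentioned above could look something like this. It is a minimal sketch: the file path is a placeholder, the Sub name is made up for illustration, and it assumes the file holds more than one cell of data (a single-cell range returns a scalar from .Value rather than an array). Bear in mind that Excel applies its own type conversion when it opens the file.

```vba
' Sketch only: open the csv as a temporary workbook, copy its used range
' into a Variant array, and close it again without saving.
Sub CsvToArrayViaWorkbook()
    Dim wb As Workbook
    Dim arrData As Variant

    Application.ScreenUpdating = False
    ' Placeholder path (assumption): point this at your own file
    Set wb = Workbooks.Open(Filename:="C:\MyFile.CSV", ReadOnly:=True)
    ' .Value on a multi-cell range returns a 1-based 2-D Variant array
    arrData = wb.Worksheets(1).UsedRange.Value
    wb.Close SaveChanges:=False
    Application.ScreenUpdating = True

    Debug.Print UBound(arrData, 1) & " rows, " & UBound(arrData, 2) & " columns"
End Sub
```

No sheet in your own workbook is touched, so there is nothing to delete afterwards.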

OK, after looking into this, the solution I have arrived at is to use ADODB (requires a reference to ActiveX Data Objects). This loads the csv file into an array without looping over the rows and columns. It does require the data to be in good condition.
Sub LoadCSVtoArray()
strPath = ThisWorkbook.Path & "\"
Set cn = CreateObject("ADODB.Connection")
strcon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strPath & ";Extended Properties=""text;HDR=Yes;FMT=Delimited"";"
cn.Open strcon
strSQL = "SELECT * FROM SAMPLE.csv;"
Dim rs As Recordset
Dim rsARR() As Variant
Set rs = cn.Execute(strSQL)
rsARR = WorksheetFunction.Transpose(rs.GetRows)
rs.Close
Set cn = Nothing
[a1].Resize(UBound(rsARR), UBound(Application.Transpose(rsARR))) = rsARR
End Sub

To get a known format csv data file into a 2D array I finally adopted the following method, which seems to work well and is quite quick.
I decided that file read operations are fairly fast nowadays, so I run a first pass on the csv file to get the size required for both dimension of the array. With the array suitably dimensioned it is then a simple task to re-read the file, line by line, and populate the array.
Function ImportTestData(ByRef srcFile As String, _
ByRef dataArr As Variant) _
As Boolean
Dim FSO As FileSystemObject, Fo As TextStream
Dim line As String, Arr As Variant
Dim lc As Long, cc As Long
Dim i As Long, j As Long
ImportTestData = False
Set FSO = CreateObject("Scripting.FilesystemObject")
Set Fo = FSO.OpenTextFile(srcFile)
' First pass; read the file to get array size
lc = 0 ' Counter for number of lines in the file
cc = 0 ' Counter for number of columns in the file
While Not Fo.AtEndOfStream ' Read the csv file line by line
line = Fo.ReadLine
If lc = 0 Then ' Count commas to get array's 2nd dim index
cc = 1 + Len(line) - Len(Replace(line, ",", ""))
End If
lc = lc + 1
Wend
Fo.Close
' Set array dimensions to accept file contents
ReDim dataArr(0 To lc - 1, 0 To cc - 1)
'Debug.Print "CSV has "; lc; " rows with "; cc; " fields/row"
If lc > 1 And cc > 1 Then
ImportTestData = True
End If
' Second pass; Re-open data file and copy to array
Set Fo = FSO.OpenTextFile(srcFile)
lc = 0
While Not Fo.AtEndOfStream
line = Fo.ReadLine
Arr = Split(line, ",")
For i = 0 To UBound(Arr)
dataArr(lc, i) = Arr(i)
Next i
lc = lc + 1
Wend
Fo.Close ' Release the file once the array is populated
End Function 'ImportTestData()
I created this as a Function rather than a Sub to get a simple return value, if required.
Reading a file with 8,500 rows of 20 columns takes approximately 180ms.
This method assumes that the structure (number of delimiters) of the CSV file is the same for every row, typical of a data logging application.

The following solution does not use ActiveX:
I wrote code to import a csv (actually tab-separated) file into an array. That code is the following.
First let's designate the array (initially it is completely void but it will be resized appropriately later):
Dim TxtFile$()
Now for the sub-procedure:
' Fills TxtFile$() array
Sub FillTextFileArray(A$)
'***********************************************************************
' Declarations
'***********************************************************************
Dim I As Long, J As Long
Dim LineString As String
'***********************************************************************
I = -1: J = 0 ' Will hold array dimensions
Open A$ For Input As #1
Do While Not EOF(1) ' Loop until end of file.
Line Input #1, LineString
LineString = LineString + vbTab ' If not done empty lines give error with Split()
I = I + 1
If J < UBound(Split(LineString, vbTab)) Then J = UBound(Split(LineString, vbTab))
Loop
ReDim TxtFile$(1 To I + 4, 1 To J + 4) ' Not indexed from 0 ! (Plus some room at the end.) This is done to match worksheet format.
Seek #1, 1 ' Reset to start
I = -1 ' Will hold array row index
Do While Not EOF(1) ' Loop until end of file.
Line Input #1, LineString
LineString = LineString + vbTab ' If not done empty lines give error with Split()
I = I + 1
For J = 0 To UBound(Split(LineString, vbTab))
TxtFile$(I + 1, J + 1) = Split(LineString, vbTab)(J)
Next J
Loop
Close #1 ' Close file.
' TxtFile$() now holds the contents of the text file
End Sub
Obviously you can then do what you want with the TxtFile$ array. A$ is the location and name of the text file. As already said, this particular code works with tab-delimited files (vbTab), not comma-delimited (separated), but any adaptation should not be too difficult. It has the advantage of avoiding ActiveX complications.

Alternatively, you can use code like this:
Dim line As String, Arr
Dim FSO As Object, Fo As Object
Set FSO = CreateObject("Scripting.FileSystemObject")
Set Fo = FSO.OpenTextFile("csvfile.csv")
While Not Fo.AtEndOfStream
line = Fo.ReadLine ' Read the csv file line by line
Arr = Split(line, ",") ' The csv line is loaded into the Arr as an array
For i = 0 To UBound(Arr) - 1: Debug.Print Arr(i) & " ";: Next
Debug.Print
Wend
01/01/2019 1 1 1 36 55.6 0.8 85.3 95 95 109 102 97 6 2.5 2.5 3.9
01/01/2019 1 2 0 24 0.0 2.5 72.1 89 0 0 97 95 10 6.7 4.9 3.9
01/01/2019 1 3 1 36 26.3 4 80.6 92 92 101 97 97 8 5.5 5.3 3.7
01/01/2019 1 4 0 16 30.0 8 79.2 75 74 87 87 86 10 3.8 4 4.2

These days, GitHub hosts at least three CSV parsers that do exactly what the OP asked for - load a CSV file into a VBA array.
I'm the author of this one:
https://github.com/PGS62/VBA-CSV
It handles a broad variety of CSV files, including those with "embedded" commas, line-feeds etc, and those with a varying number of fields per row. I provide links to alternative VBA CSV parsers in the README file.

Related

Code problem with a format of files which the loop goes over

I have a problem with some VBA code. The macro below is supposed to go to the destination folder, which contains only Excel files but with different extensions (xls, xlsx, xlsm), and loop over the existing files to find the largest number within the file names (examples of current files are DelKra 2021-()-162.xls; DelKra 2021-()-163.xls; DelKra 2021-()-164.xlsm).
The macro runs smoothly only when the destination folder contains xls files, but it crashes whenever another type of Excel file is saved in the folder. The command the macro crashes at is:
"CurrentNum = Mid(FileName, Len(FileName) - 6, 3)".
Please help me to fix my macro.
Sub ConfirmAndSaveDel()
DestinationFolder = "\\oscwawfs01.kingfisherasia.com.hk\common\FINANCE\Public\BUSINESS TRIPS\Business Trip Delegacje\2021\Domestic\"
Dim FileArray() As Variant
Dim FileCount As Integer
Dim FileName As String
Dim LastNum As Integer
Dim CurrentNum As Integer
Dim Numerek As String
Dim whereTrip As String
Dim purposeTrip As String
Dim whoTrip As String
Dim startTrip As Date
Dim endTrip As Date
Dim LastRow As Integer
LastNum = 0
FileCount = 0
FileName = Dir(DestinationFolder)
'Loop searching all files
Do While FileName <> ""
FileCount = FileCount + 1
ReDim Preserve FileArray(1 To FileCount)
FileArray(FileCount) = FileName
'Take from the file name the characters 6, 5 and 4 counting from the right
CurrentNum = Mid(FileName, Len(FileName) - 6, 3)
'If it is larger than the current one remember it
If CurrentNum > LastNum Then
LastNum = CurrentNum
End If
'Debug.Print FileName
FileName = Dir()
Loop
'Add 1 to the largest number found
LastNum = LastNum + 1
'Debug.Print LastNum
'Change the number to a string and pad it with leading zeros to make it a three-digit number
If Len(Trim(CStr(LastNum))) = 1 Then
Numerek = "00" & CStr(LastNum)
ElseIf Len(Trim(CStr(LastNum))) = 2 Then
Numerek = "0" & CStr(LastNum)
ElseIf Len(Trim(CStr(LastNum))) = 3 Then
Numerek = CStr(LastNum)
End If
'Combine the whole name of the new file
NazwaPliku = "DelKra 2021-" & "(" & Range("FRIFAR").Value & ")-" & Numerek
Try using such a function:
Function extractNumber(strName As String) As Long
Dim arr: arr = Split(strName, "-")
extractNumber = Split(arr(Ubound(arr)), ".")(0)
End Function
Copy the above function in the same module and call it as:
CurrentNum = extractNumber(fileName)
That is, replace CurrentNum = Mid(FileName, Len(FileName) - 6, 3) with the call above. It does not depend on the number of characters in the extension.
Besides that, please replace all declarations As Integer with As Long. In VBA, declaring As Integer brings no benefit in terms of memory handling or from any other point of view, so it is a good habit to use Long in all cases. But if you prefer your way, adapt the function to return As Integer.
The following function will extract the part of the filename between the last dash and the last dot of the filename. If it is numeric, it will return that number, else (or if the filename doesn't follow the pattern) 0.
Function getFileNumber(filename As String) As Long
Dim pDash As Long, pDot As Long
pDash = InStrRev(filename, "-")
pDot = InStrRev(filename, ".")
If pDash = 0 Or pDot = 0 Or pDot < pDash Then Exit Function
Dim suffix As String
suffix = Mid(filename, pDash + 1, pDot - pDash - 1)
If IsNumeric(suffix) Then
getFileNumber = Val(suffix)
End If
End Function

How to keep leading zeros when opening CSV file in VBA

I have VBA code that quickly transfers data from CSV files, but unfortunately it drops leading zeros (for example, 000123 is converted to 123)
Dim wbI As Workbook, wbO As Workbook
Dim wsI As Worksheet
Set wbI = ThisWorkbook
Set wsI = wbI.Sheets("Temp")
Filename = "c:\text.csv"
Set wbO = Workbooks.Open(Filename)
wbO.Sheets(1).Cells.Copy wsI.Cells
wbO.Close SaveChanges:=False
I have tried to add the following after opening the csv file > Cells.NumberFormat = "#"
Set wbO = Workbooks.Open(Filename)
Cells.NumberFormat = "#"
wbO.Sheets(1).Cells.Copy wsI.Cells
wbO.Close SaveChanges:=False
Unfortunately, it is not working. The problem I see is that the leading zeros are already missing once the file has been opened.
Is it possible to open the file without affecting the leading zeros and show all the data as text to maintain the leading zeros?
Try this way, please:
Sub testOpenWithLZeroTxt()
Dim Filename As String, wbI As Workbook, wbO As Workbook, wsI As Worksheet
Dim arrTXT, nrCol As Long, arr(), i As Long, sep As String, lineSep As String
Dim allTxt As String, txtStr As Object, fileTxt As String, fs As Object, f As Object, ts As Object
Set wbI = ThisWorkbook
Set wsI = wbI.Sheets("Temp")
Filename = "c:\text.csv"
fileTxt = Split(Filename, ".")(0) & ".txt" 'create a helper txt file using the csv string content
Set fs = CreateObject("Scripting.FileSystemObject")
allTxt = fs.OpenTextFile(Filename, 1).ReadAll 'read the csv file content
fs.CreateTextFile fileTxt
Set f = fs.GetFile(fileTxt)
Set ts = f.OpenAsTextStream(2, -2)
ts.write allTxt 'write the csv content in a newly created txt file
ts.Close
'Check the number of text file columns:_______
sep = vbLf 'line separator: vbLf works on your file; if it does not, try vbCrLf
lineSep = "," 'field separator: "," works on your file; it may also be vbTab, ";" etc.
arrTXT = Split(allTxt, sep)
nrCol = UBound(Split(arrTXT(0), lineSep))
'_____________________________________________
ReDim arr(nrCol) 'redim the format array
For i = 0 To nrCol
arr(i) = Array(i + 1, 2) 'fill the format array with variant for TEXT Format!
Next
'open the helper txt file as you need:
Workbooks.OpenText Filename:=fileTxt, origin:=437, startRow:=1, _
DataType:=xlDelimited, Tab:=False, Comma:=True, FieldInfo:=arr()
Set wbO = ActiveWorkbook
'wbO.Sheets(1).cells.Copy wsI.Range("A1") 'copy the content
wbO.Close SaveChanges:=False 'close the file
Kill fileTxt 'kill helper txt file
End Sub
Edited:
I changed the code philosophy. It will firstly read the csv content in a string variable and create a txt file using the obtained string and open it as text, which certainly should work. It will work for any number of columns in the csv file.
The line break in your csv file is the Unix LF, which corresponds to Chr(10).
Since the number of columns in the first row and in the following rows is inconsistent, a little slack was allowed: the array is created with twice the number of columns found in the first row.
Sub test()
Dim Ws As Worksheet
Dim Fn As String
Dim Arr As Variant
Fn = "Example.csv"
'Fn = "c:\text.csv"
Set Ws = Sheets("Temp")
Arr = getDatFromCsv(Fn)
With Ws
.Cells.NumberFormat = "@" 'Text format, so that leading zeros are kept
.Cells = Empty
.Range("a1").Resize(UBound(Arr, 1) + 1, UBound(Arr, 2) + 1) = Arr
End With
End Sub
Function getDatFromCsv(strFn As String) As Variant
Dim vR() As String
Dim i As Long, r As Long, j As Integer, c As Integer
Dim objStream As Object
Dim strRead As String
Dim vSplit, vRow
Dim s As String
Set objStream = CreateObject("ADODB.Stream")
With objStream
.Charset = "utf-8"
.Open
.LoadFromFile strFn
strRead = .ReadText
.Close
End With
vSplit = Split(strRead, Chr(10)) 'Unix Lf ~~> chr(10)
r = UBound(vSplit)
c = UBound(Split(vSplit(0), ",", , vbTextCompare))
ReDim vR(0 To r, 0 To c * 2)
For i = 0 To r
vRow = Split(vSplit(i), ",", , vbTextCompare)
'If UBound(vRow) = c Then 'if it is empty line, skip it
For j = 0 To UBound(vRow)
vR(i, j) = vRow(j)
Next j
'End If
Next i
getDatFromCsv = vR
Set objStream = Nothing
End Function
Use OpenText method instead.
The most important parameter is FieldInfo. You need to pass:
an array containing parse information for individual columns of data. The interpretation depends on the value of DataType. When the data is delimited, this argument is an array of two-element arrays, with each two-element array specifying the conversion options for a particular column. The first element is the column number (1-based), and the second element is one of the XlColumnDataType constants specifying how the column is parsed.
In other words, every column with leading zeros has to be defined as xlTextFormat.
I'd suggest recording a macro. ;) You'll find the option to load text data under the Data tab -> ... -> From Text/CSV

Saving a MS Word document with consecutive numbering

@PKatona when I tried it in a real setting it overwrote some files. Upon examining the code, I realized it was counting the number of files in the directory and saving as the next number (say, the 15th file in the folder as 'ST14 TC15') instead of saving as the next number after the highest one in the filenames (say there are only 3 files in the directory and the one with the highest ending is 'ST14 TC06', so the next file should be saved as 'ST14 TC07'). I hope that makes sense. Using some of your code I was able to come up with this. However: 1) there must be a way to shorten it! 2) it works in Excel (where I made it), but the 'Evaluate' line towards the end gives a 'Sub or function not defined' error in Word!
Thanks again
Sub Largest()
Dim rng As Range
Dim dblMax As Double
Dim var_data(200)
Dim var_numdata(200)
'* - * - *
'to put filenames in a specific directory into an array
Dim MyFile As String
Dim Counter As Long
'Create a dynamic array variable, and then declare its initial size
Dim DirectoryListArray() As String
ReDim DirectoryListArray(1000)
Dim str()
ReDim str(1000)
Dim num()
ReDim num(1000)
'Loop through all the files in the directory by using Dir$ function
MyFile = Dir$("C:\HAPPY\SANTA\ELVES\*.docx")
Do While MyFile <> ""
DirectoryListArray(Counter) = MyFile
MyFile = Dir$
str(Counter) = Mid(DirectoryListArray(Counter), 8, 3)
num(Counter) = Evaluate(str(Counter))
Counter = Counter + 1
Loop
'Reset the size of the array without losing its values
ReDim Preserve DirectoryListArray(Counter - 1)
ReDim Preserve str(Counter - 1)
ReDim Preserve num(Counter - 1)
dblMax = Application.WorksheetFunction.Max(num())
Dim nextFilename As String
nextFilename = "C:\HAPPY\SANTA\ELVES\ST14 HP" + Format((dblMax + 1), "000") + ".docx"
ActiveDocument.SaveAs Filename:=nextFilename
ActiveDocument.Close
End Sub
This will find the last file sequence:
Dim filename As String
Dim seq As Integer
seq = 1
filename = Dir("C:\HAPPY\SANTA\ELVES\ST14 TC*.docx")
Do While filename <> ""
seq = seq + 1
filename = Dir
Loop
Dim nextFilename As String
nextFilename = "C:\HAPPY\SANTA\ELVES\ST14 TC" + Format(seq, "000") + ".docx"
Add your macro code here...
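Since counting files gives the wrong result as soon as the numbers have gaps, another option is to track the highest number actually present in the file names. The following is a rough sketch rather than a tested drop-in: NextSequenceNumber and SaveWithNextNumber are hypothetical names, and the code assumes every file is named prefix + digits + ".docx", as in the 'ST14 TC06' example. It uses only built-in VBA functions, so it should not need Evaluate or WorksheetFunction in Word.

```vba
' Hypothetical helper: returns one more than the highest number found in
' file names of the form <prefix><digits>.docx inside folderPath.
Function NextSequenceNumber(ByVal folderPath As String, ByVal prefix As String) As Long
    Dim fileName As String, numPart As String
    Dim highest As Long, dotPos As Long

    fileName = Dir(folderPath & prefix & "*.docx")
    Do While fileName <> ""
        ' Take the characters between the prefix and the last dot
        dotPos = InStrRev(fileName, ".")
        numPart = Mid$(fileName, Len(prefix) + 1, dotPos - Len(prefix) - 1)
        ' Ignore files whose suffix is not a number
        If IsNumeric(numPart) Then
            If CLng(numPart) > highest Then highest = CLng(numPart)
        End If
        fileName = Dir
    Loop
    NextSequenceNumber = highest + 1
End Function

' Usage sketch with the folder and prefix from the question
Sub SaveWithNextNumber()
    Const FOLDER As String = "C:\HAPPY\SANTA\ELVES\"
    Const PREFIX As String = "ST14 TC"
    ActiveDocument.SaveAs fileName:=FOLDER & PREFIX & _
        Format(NextSequenceNumber(FOLDER, PREFIX), "000") & ".docx"
End Sub
```

With only 'ST14 TC06.docx' as the highest file in the folder, the helper would return 7, whatever the number of files.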

How to read the second last line in a text file

I would like to read a large file in VBA and saw this code online:
Dim MyChar As String, Pointer As Long, LastLine As String
Open "MyTextFile.Txt" For Binary As #1
Pointer = LOF(1) - 2
MyChar = Chr$(32)
Do
Get #1, Pointer, MyChar
If MyChar = vbCr Or MyChar = vbLf Then
Exit Do
Else: Pointer = Pointer - 1
LastLine = MyChar & LastLine
End If
Loop
MsgBox "Last Line is " & LastLine
How do I change this code to get the second last line? Need some help on this.
Thought of this:
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile( _
"MyTextFile.Txt", 1)
objTextFile.ReadAll
MsgBox objTextFile.Line
But I can't get to the line-2.
The code you're providing works as follows:
1. It sets a pointer to the last character of the file.
2. It then reads that file backwards until it finds a linebreak.
3. It returns all it has read as the last line.
To modify this for your needs, I have added a Boolean secondRun, which lets the code run step 2 again, thus recording the 2nd last line:
Dim MyChar As String, Pointer As Long, LastLine As String
Open "MyTextFile.Txt" For Binary As #1
Pointer = LOF(1) - 2
MyChar = Chr$(32)
Dim secondRun As Boolean
Do
' Read character at position "Pointer" into variable "MyChar"
Get #1, Pointer, MyChar
If MyChar = vbCr Or MyChar = vbLf Then ' Linebreak = line read completely
If Not secondRun Then
' Run again if we've read only one line so far
secondRun = True
LastLine = ""
Pointer = Pointer - 2
Else
Exit Do
End If
Else: Pointer = Pointer - 1
' Add character to result String
LastLine = MyChar & LastLine
End If
Loop
MsgBox " 2nd last line is " & LastLine
Depends on your approach. But if the files are really that big then you probably don't want Excel to load the entire file. So you'll probably open the file and read it line by line without knowing how big it is or how many rows it has. In that case it's probably easiest to just store two lines at a time in two separate string variables. As soon as you hit the last row you can exit your loop, as shown above in your code, and output not only the last row (as is already done in your code) but also the content of the second last row in that file.
Public Sub GetSecondLastRow()
Dim strSecondLastLine As String
Dim strFileToImport As String
Dim strLastLine As String
Dim intPointer As Integer
Dim lngCounter As Long
strFileToImport = ThisWorkbook.Path & IIf(InStr(1, ThisWorkbook.Path, "\") > 0, "\", "/") & "MyTextFile.txt"
intPointer = FreeFile()
Open strFileToImport For Input Access Read Lock Read As #intPointer
lngCounter = 0
Do Until EOF(intPointer)
strSecondLastLine = strLastLine
Line Input #intPointer, strLastLine
lngCounter = lngCounter + 1
Loop
Close intPointer
Debug.Print "Content of the second last row:"
Debug.Print "---------------------------------------"
Debug.Print strSecondLastLine
Debug.Print "---------------------------------------"
Debug.Print "Content of the last row:"
Debug.Print "---------------------------------------"
Debug.Print strLastLine
End Sub
The alternative would be to first query the file for its row count and then get the second last record in that file using ADO. But I doubt that would be faster. The problem with ADO is that you get a huge recordset back containing the entire text file, because the query SELECT * FROM MyTextFile.txt has no WHERE restriction. So the entire text file goes into memory before you can do anything with it. Then, of course, you can check the RecordCount and go through all the records again with a fast-forward cursor until you hit the second last row. Unfortunately, ADO does not support
row_number() over (order by @@ROWCOUNT).
Otherwise, you could first get the row count with select count(1) from MyTextFile.txt and then afterwards only the applicable row.
So, in any case, I am almost certain (without having tested it) that ADO will perform below par and the first solution is the way to go if the text files are as big as you say. If you still prefer ADO then this is the code for that (based on the following SO question / answer: Copying text from .txt file in Excel using ADO ignores first row).
Sub ImportTextFile()
'Imports text file into Excel workbook using ADO.
'If the number of records exceeds 65536 then it splits it over more than one sheet.
Dim strFilePath As String, strFilename As String, strFullPath As String
Dim lngCounter As Long
Dim oConn As ADODB.Connection
Dim oRS As ADODB.Recordset
Dim oFSObj As Object
'Get a text file name
strFullPath = Application.GetOpenFilename("Text Files (*.txt),*.txt", , "Please select text file...")
If strFullPath = "False" Then Exit Sub 'User pressed Cancel on the open file dialog
'This gives us a full path name e.g. C:\temp\folder\file.txt
'We need to split this into path and file name
Set oFSObj = CreateObject("SCRIPTING.FILESYSTEMOBJECT")
strFilePath = oFSObj.GetFile(strFullPath).ParentFolder.Path
strFilename = oFSObj.GetFile(strFullPath).Name
'Open an ADO connection to the folder specified
Set oConn = New ADODB.Connection
oConn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strFilePath & ";" & _
"Extended Properties=""text;HDR=No;FMT=Delimited"""
Set oRS = New ADODB.Recordset
'First query: write the row count to A1
'(3 = adOpenStatic, 1 = adLockReadOnly, 1 = adCmdText)
oRS.Open "SELECT count(1) FROM [" & strFilename & "]", oConn, 3, 1, 1
Range("A1").CopyFromRecordset oRS
oRS.Close
Set oRS = New ADODB.Recordset
'Second query: open the whole text file and walk through it row by row
oRS.Open "SELECT * FROM [" & strFilename & "]", oConn, 3, 1, 1
While Not oRS.EOF And Not oRS.BOF
If oRS.AbsolutePosition = Range("A1").Value2 Then
Range("A2").Value = oRS.Fields(0).Value
End If
oRS.MoveNext
Wend
oRS.Close
oConn.Close
End Sub
You can try this:
Public Function GetSecondLastLine(sFileName As String, Optional sLineDelimiter As String = vbCrLf) As String
Dim sContent As String
Dim aLines() As String
sContent = TextFromFile(sFileName)
aLines = Split(sContent, sLineDelimiter)
GetSecondLastLine = aLines(UBound(aLines) - 1)
End Function
Public Function TextFromFile(sFileName As String) As String
Dim lFile As Long
lFile = FreeFile
Open sFileName For Input As #lFile
TextFromFile = Input$(LOF(lFile), lFile)
Close #lFile
End Function
If necessary, you can change the line delimiter (e.g. vbCr or vbLf).
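As a rough sketch of how the delimiter argument might be used: many text files end with a line delimiter, in which case Split returns an empty final element and UBound - 1 points at the last line rather than the second last. The helper below drops trailing empty elements first. The name SecondLastNonEmptyLine and the path C:\Temp\MyTextFile.txt are made up for illustration; TextFromFile is the function shown above.

```vba
'Hypothetical helper: returns the second last non-empty-terminated line of sContent.
'Assumes lines are separated by sLineDelimiter.
Public Function SecondLastNonEmptyLine(sContent As String, _
        Optional sLineDelimiter As String = vbCrLf) As String
    Dim aLines() As String
    Dim lLast As Long
    aLines = Split(sContent, sLineDelimiter)
    lLast = UBound(aLines)   '-1 if sContent is empty
    'Skip empty elements produced by trailing delimiters
    Do While lLast >= 0
        If Len(aLines(lLast)) > 0 Then Exit Do
        lLast = lLast - 1
    Loop
    'Returns an empty string if there are fewer than two lines
    If lLast >= 1 Then SecondLastNonEmptyLine = aLines(lLast - 1)
End Function

Sub DemoSecondLastLine()
    'Example path only; vbLf is passed for a file with LF-only line breaks
    Debug.Print SecondLastNonEmptyLine(TextFromFile("C:\Temp\MyTextFile.txt"), vbLf)
End Sub
```

Like the original, this still reads the whole file into memory, so it is probably only suitable for files of moderate size.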
"True to request" version:
(Improved on @Verzweifler's answer (imho))
Linux compatible (Linebreaks with LF only instead of CR LF possible)
Accounting for multiple open files
Using an empty, fixed-length defined string as char (no assignment needed)
'Variables
Dim Path As String
Path = "MyTextFile.txt"
Dim FileNumber As Long
FileNumber = FreeFile 'Use first unused file number
Dim Pointer As Long
Dim Char As String * 1 'String of fixed length 1
Dim SecondLastLine As String
Dim SecondRun As Boolean
'Read last two lines of file
Open Path For Binary As FileNumber
Pointer = LOF(FileNumber) 'Set pointer to last file position
Do
Get FileNumber, Pointer, Char 'Read char at position "Pointer" into "Char"
If Char = vbCr Then
Pointer = Pointer - 1 'Skip CRs so that CRLF and LF-only line breaks are both handled
ElseIf Char = vbLf Then
If Not SecondRun Then
SecondRun = True
SecondLastLine = vbNullString
Pointer = Pointer - 1
Else
Exit Do
End If
Else
Pointer = Pointer - 1
SecondLastLine = Char & SecondLastLine 'Add char to result String
End If
Loop While Pointer > 0 'Stop once the start of the file has been reached
Close FileNumber
MsgBox " 2nd last line is " & SecondLastLine
Extended version:
Gets the last Count lines of the file
'Variables
Dim Path As String
Path = "MyTextFile.txt"
Dim Count As Long
Count = 2
Dim FileNumber As Long
FileNumber = FreeFile 'Use first unused file number
Dim Pointer As Long
Dim Char As String * 1 'String of fixed length 1
Dim CurrentLineNumber As Long
CurrentLineNumber = 0
Dim LastLines() As String
ReDim LastLines(0 To Count - 1)
'Read Count last lines of file
Open Path For Binary As FileNumber
Pointer = LOF(FileNumber) 'Set pointer to last file position
Do
Get FileNumber, Pointer, Char 'Read char at position "Pointer" into "Char"
If Char = vbCr Then
Pointer = Pointer - 1 'Skip CRs so that CRLF and LF-only line breaks are both handled
ElseIf Char = vbLf Then
If CurrentLineNumber < Count - 1 Then
CurrentLineNumber = CurrentLineNumber + 1
Pointer = Pointer - 1
Else
Exit Do
End If
Else
Pointer = Pointer - 1
LastLines(CurrentLineNumber) = Char & LastLines(CurrentLineNumber) 'Add char to result String
End If
Loop While Pointer > 0 'Stop once the start of the file has been reached
Close FileNumber
Dim Line As Variant
For Each Line In LastLines
Debug.Print Line
Next

Parsing Data in Excel Causes Crash

I was wondering if anyone knew a way to parse rather large data files in Excel VBA because whenever I try the simple data parse it crashes the program. The data is formatted as such
593972,Data,15:59:59.820,9519,9519,Px(25.5),9519,9500,10001,10226,10451,0,0,0,0,0,28.7604,25.4800,25.4841
and there are about 3 million lines formatted exactly the same way. I want to pull out certain values from a line if its first value (593972 in the line above) is a specific number. I am rather new to VBA, so any help would be much appreciated. Thanks so much for your time!
Try using FSO; modify to suit your needs.
Sub ParseFile()
Dim fso As Object
Set fso = CreateObject("Scripting.FileSystemObject")
Dim strLine As String
Dim arrLine() As String
Dim objFile
Const ForReading = 1, ForWriting = 2, ForAppending = 8
Set objFile = fso.OpenTextFile("C:\Temp\Text File.txt", ForReading) '<modify path as needed
Do Until objFile.AtEndOfStream
strLine = Trim(objFile.Readline)
If (strLine <> "") Then
arrLine = Split(strLine, ",") 'one dimensional array
'parse the arrLine to test for the data you need
Dim FirstValue As String
FirstValue = arrLine(0)
If FirstValue = "593972" Then
'put the data in Excel if desired/needed
End If
End If
Loop
objFile.Close
Set objFile = Nothing
End Sub
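For the "put the data in Excel" step, one option is to collect the matching rows first and write them to the sheet in a single assignment instead of cell by cell. This is only a sketch: WriteRowsToSheet and colRows are hypothetical names, and it assumes that ParseFile adds each matching arrLine to a Collection (colRows.Add arrLine) inside the If block.

```vba
'Sketch: copy matching rows, already split into fields, to a 2-D Variant
'array and write it to a worksheet in one assignment.
'Assumes colRows holds one String array (from Split) per matching line.
Sub WriteRowsToSheet(colRows As Collection, ws As Worksheet)
    Dim arrOut() As Variant
    Dim vRow As Variant
    Dim lRow As Long, lCol As Long, lMaxCol As Long

    If colRows.Count = 0 Then Exit Sub

    'Find the widest row so the output array is large enough
    For Each vRow In colRows
        If UBound(vRow) + 1 > lMaxCol Then lMaxCol = UBound(vRow) + 1
    Next vRow

    'Split returns zero-based arrays; the output array is one-based
    ReDim arrOut(1 To colRows.Count, 1 To lMaxCol)
    For Each vRow In colRows
        lRow = lRow + 1
        For lCol = 0 To UBound(vRow)
            arrOut(lRow, lCol + 1) = vRow(lCol)
        Next lCol
    Next vRow

    'Single write to the sheet, starting at A1
    ws.Range("A1").Resize(colRows.Count, lMaxCol).Value = arrOut
End Sub
```

A worksheet holds at most 1,048,576 rows, so this only works if the number of matching lines stays below that limit; if only a few fields are needed, copying just those columns into arrOut keeps the array smaller.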
The Sub below opens a text stream, reads it line by line, and checks for each line whether the first field has a certain value; adapt it to do what you want:
Public Sub ReadAndValidate( _
ByVal FileName As String, _
ByVal FieldKey As String _
)
' This Sub doesn't do error handling, assumes that the '
' field separator is "," and that the key field is first. '
' It uses the "Scripting" lib; "Microsoft Scripting Runtime"'
' needs to be referenced by the containing workbook. '
Dim line As String
Dim keylen As Long
Dim fs As Scripting.FileSystemObject
Dim f As Scripting.TextStream
Let FieldKey = FieldKey & "," ' add the separator to the key '
Let keylen = Strings.Len(FieldKey)
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.OpenTextFile( _
FileName:=FileName, _
IOMode:=IOMode.ForReading _
)
While Not f.AtEndOfStream
Let line = f.ReadLine()
If Strings.Left$(line, keylen) = FieldKey Then
' replace the statement below with your code '
Debug.Print line
End If
Wend
f.Close
End Sub
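A call might look like the following sketch; DemoReadAndValidate is a made-up name, and the path and key are example values taken from the question and the previous answer, not known to exist on your machine.

```vba
'Hypothetical caller; the path and key are example values only.
'Requires a reference to "Microsoft Scripting Runtime", as ReadAndValidate does.
Sub DemoReadAndValidate()
    ReadAndValidate FileName:="C:\Temp\Text File.txt", FieldKey:="593972"
End Sub
```

Because the Sub appends the separator to the key before comparing, a key of "593972" matches lines starting with "593972," but not lines starting with "5939721,".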