I want to read a text file into an array line by line. When a line contains a TAB character, that TAB is completely ignored. How can I easily avoid the TAB being skipped?
Code basically looks like
Open strFilePath For Input As #1
arrIndex = 0
Do While Not EOF(1)
Line Input #1, my_row
ReDim Preserve strArrLines(0 To arrIndex)
strArrLines(arrIndex) = my_row
arrIndex = arrIndex + 1
Loop
Reading the file line by line is a must, since the lines get filtered before they're put into the array (not shown in the code here, to show it as easy as possible).
I am writing a program in VB.NET that loops through a file containing file paths and performs an action on each one. Each path is on its own line, and I'm looping through the file like this:
Dim FileContents As String
FileContents = System.IO.File.ReadAllText("C:\File.txt")
Dim FileSplit As String()
FileSplit = FileContents.Split(vbCrLf)
For Each ThisLine In FileSplit
Dim FileModified As Date
FileModified = System.IO.File.GetLastWriteTime(ThisLine)
'Do something here
Next
Contents of File.txt is:
Y:\Users\localadmin\Desktop\MakeShadowCopy\FileInfo.vb
Y:\Users\localadmin\Desktop\MakeShadowCopy\FindFiles.vb
Y:\Users\localadmin\Desktop\MakeShadowCopy\MakeShadowCopy.sln
Y:\Users\localadmin\Desktop\MakeShadowCopy\MakeShadowCopy.v12.suo
The loop works fine, but the line with GetLastWriteTime() throws an exception saying that the path contains illegal characters, even though it is just a normal string with a file path in it.
If anyone has any ideas, or knows how to escape the string going into GetLastWriteTime(), that would be much appreciated :)
Thanks!
Probably the lines in your file are not correctly terminated with vbCrLf.
If that is the case, Split cannot divide your input into lines correctly, and you end up passing the whole text to GetLastWriteTime.
Instead of ReadAllText you could use ReadAllLines and leave the line splitting to the Framework, which knows how to handle line feed and carriage return codes.
For Each ThisLine In System.IO.File.ReadAllLines("C:\file.txt")
Dim FileModified As Date
FileModified = System.IO.File.GetLastWriteTime(ThisLine.Trim())
Next
Also add a Trim to the ThisLine variable to remove any unseen characters erroneously added to the line.
Two ideas:
Use For instead of For Each and check whether you get the exception on the very first iteration. If not, you may have an issue with one specific file path; in that case, check the value of the iteration variable.
Open the file in a hex editor and ensure that each line is terminated properly. You might have either CR (13) or LF (10) at the end of each line, but not both, as is normal on Windows.
I would like to write some data to a csv file. I am using this code:
Dim Filename As String, line As String
Dim A As Integer
Filename = "D:" & "\testfile.csv"
Open Filename For Output As #1
For A = 1 To 100
Print #1, "test, test, test"
Next A
Close #1
The problem is that this code rewrites the csv file from the beginning, but I would like to add data at the end of the file. (For example, if I run this code three times, I would like to have 300 lines in the csv file.)
What should I do?
In which case you need Open Filename For Append As #1.
You might also find that Write #1, behaves better than Print #1, if your line contains quotation characters.
One last thing, don't hardcode the #1 as someone else may be using that handle. Instead, use
Dim n as Integer
n = Freefile 'Let VBA find a free file handle
'use #n rather than #1 from here.
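Putting both suggestions together, a minimal sketch might look like the following. AppendCsvRows and its parameters are hypothetical names chosen for illustration; the row text is the one from the question.

```vb
' Sketch: append rows to a CSV file using a file number from FreeFile.
' AppendCsvRows, filePath and rowCount are illustrative names (assumptions).
Public Sub AppendCsvRows(ByVal filePath As String, ByVal rowCount As Long)
    Dim fileNum As Integer
    Dim i As Long

    fileNum = FreeFile                       ' ask VBA for an unused file number
    Open filePath For Append As #fileNum     ' Append creates the file if it is missing
    For i = 1 To rowCount
        Print #fileNum, "test, test, test"   ' same row as in the question
    Next i
    Close #fileNum
End Sub
```

Calling AppendCsvRows "D:\testfile.csv", 100 three times should leave 300 lines in the file, since each call adds to the end instead of truncating.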
Here is your Error:
Open Filename For Output As #1
which should be:
Open Filename For Append As #1
This will append your new text to the end of a stream.
I am trying to make some code that will generate a random number and then check the numbers on each line in a text file to see if it has already been generated. I have everything but the code that will check for the generated number in the text file. Any ideas?
Here is the code I have so far:
Dim Rlo As New IO.StreamReader("C:\Users\Somebody\Documents\Visual Studio 2012\Projects\RobloxRecruitV1\RobloxRecruitV1\bin\Debug\" & TheFileName.Text & ".txt")
Dim firstLine As String
'read first line
firstLine = Rlo.ReadLine()
'read secondline
TheText.Text = Rlo.ReadLine()
rndnumber = New Random
number = rndnumber.Next(firstLine, TheText.Text)
TextBox1.Text = number.ToString
I can't give you the exact code (It's been a long time since I did anything in VB6...)
but....
I can tell you that using a stream reader is the wrong approach.
A stream reader is exactly what its name suggests: a constant stream of data that starts and then stops when it reaches the end.
Now while it's true that you can to a small extent seek back and forth in a stream, that's not really what you need in this case.
What you need is to load all the lines of your file into an in memory array or some kind of hash table, then your task simply becomes one of looking to see if a given index exists.
If you have no choice but to use the file as is on disk (Due to size restrictions for example) then the approach you need is this:
1) Open the file
2) Set your position to the beginning
3) Enter a loop reading sequential lines
4) Once you have the line that corresponds to the count you're looking for, close the file and end
5) Otherwise, loop back round until there are no more lines left
6) Close the file
Opening, closing, and resetting each time is important, so that you KNOW EXACTLY where in the file you're starting from each time. You could in theory keep the file open and just reset the position, but that in my mind could be dangerous, especially if you have other processes writing to it.
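The on-disk steps above can be sketched roughly as follows. This is written in VBA/VB6 style, in line with the answer's own framing, not in the VB.NET of the question; FileContainsLine and its parameters are hypothetical names.

```vb
' Sketch: scan a file from the start for one value, then close it again.
' FileContainsLine, filePath and wanted are illustrative names (assumptions).
Public Function FileContainsLine(ByVal filePath As String, ByVal wanted As String) As Boolean
    Dim fileNum As Integer
    Dim oneLine As String

    fileNum = FreeFile
    Open filePath For Input As #fileNum      ' a fresh Open always starts at the first line
    Do While Not EOF(fileNum)
        Line Input #fileNum, oneLine
        If Trim$(oneLine) = wanted Then
            FileContainsLine = True          ' found it, stop reading
            Exit Do
        End If
    Loop
    Close #fileNum                           ' closed whether or not a match was found
End Function
```

Because the file is reopened on every call, each lookup costs a full scan in the worst case, which is why this only makes sense when the file is too large to hold in memory.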
If your file is not very big, then I'd opt for an in memory approach, load the file, perform operations on the in memory array of lines, then save it before exit.
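For the in-memory variant, one possible sketch is to load the lines into a Scripting.Dictionary once, so that each later check is a single Exists call. This assumes Windows, where the Scripting runtime is available, and again uses VBA/VB6 style; LoadLinesAsSet is a hypothetical helper name.

```vb
' Sketch: load every non-empty line of a text file into a dictionary
' that is used as a set. LoadLinesAsSet is an illustrative name (assumption).
Public Function LoadLinesAsSet(ByVal filePath As String) As Object
    Dim seen As Object
    Dim fileNum As Integer
    Dim oneLine As String

    Set seen = CreateObject("Scripting.Dictionary")   ' late bound, no reference needed
    fileNum = FreeFile
    Open filePath For Input As #fileNum
    Do While Not EOF(fileNum)
        Line Input #fileNum, oneLine
        oneLine = Trim$(oneLine)
        If Len(oneLine) > 0 Then seen(oneLine) = True ' key is the text of the line
    Loop
    Close #fileNum

    Set LoadLinesAsSet = seen
End Function
```

A caller would keep the returned object around and test a candidate with something like used.Exists(CStr(candidate)), adding the new number to both the dictionary and the file once it is accepted.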
I believe I have come up with a very efficient way to read very, very large files line-by-line. Please tell me if you know of a better/faster way or see room for improvement. I am trying to get better at coding, so any sort of advice you have would be nice. Hopefully this is something that other people might find useful, too.
It appears to be something like 8 times faster than using Line Input from my tests.
'This function reads a file into a string. '
'I found this in the book Programming Excel with VBA and .NET. '
Public Function QuickRead(FName As String) As String
Dim I As Integer
Dim res As String
Dim l As Long
I = FreeFile
l = FileLen(FName)
res = Space(l)
Open FName For Binary Access Read As #I
Get #I, , res
Close I
QuickRead = res
End Function
'This function works like the Line Input statement'
Public Sub QRLineInput( _
ByRef strFileData As String, _
ByRef lngFilePosition As Long, _
ByRef strOutputString, _
ByRef blnEOF As Boolean _
)
On Error GoTo LastLine
strOutputString = Mid$(strFileData, lngFilePosition, _
InStr(lngFilePosition, strFileData, vbNewLine) - lngFilePosition)
lngFilePosition = InStr(lngFilePosition, strFileData, vbNewLine) + 2
Exit Sub
LastLine:
blnEOF = True
End Sub
Sub Test()
Dim strFilePathName As String: strFilePathName = "C:\Fld\File.txt"
Dim strFile As String
Dim lngPos As Long
Dim blnEOF As Boolean
Dim strFileLine As String
strFile = QuickRead(strFilePathName) & vbNewLine
lngPos = 1
Do Until blnEOF
Call QRLineInput(strFile, lngPos, strFileLine, blnEOF)
Loop
End Sub
Thanks for the advice!
My two cents…
Not long ago I needed to read large files using VBA and noticed this question. I tested three approaches to reading data from a file, comparing their speed and reliability across a wide range of file sizes and line lengths. The approaches are:
Line Input VBA statement
Using the File System Object (FSO)
Using Get VBA statement for the whole file and then parsing the string read as described in posts here
Each test case consists of three steps:
Test case setup that writes a text file containing given number of lines of the same given length filled by the known character pattern.
Integrity test. Read each file line and verify its length and contents.
File read speed test. Read each line of the file repeated 10 times.
As you can notice, Step #3 verifies the true file read speed (as asked in the question) while Step #2 verifies the file read integrity and therefore simulates real conditions when string parsing is needed.
The following chart shows the test results for the File read speed test. The file size is 64M bytes for all tests, and the tests differ in line length that varies from 2 bytes (not including CRLF) to 8M bytes.
CONCLUSION:
All the three methods are reliable for large files with normal and abnormal line lengths (please compare to Graeme Howard’s answer)
All the three methods produce almost equivalent file reading speed for normal line lengths
“Superfast way” (Method #3) works fine for extremely long lines while the other two don’t.
All this is applicable to different Offices, different PCs, for VBA and VB6
You can use Scripting.FileSystemObject to do that thing.
From the Reference:
The ReadLine method allows a script to read individual lines in a text file. To use this method, open the text file, and then set up a Do Loop that continues until the AtEndOfStream property is True. (This simply means that you have reached the end of the file.) Within the Do Loop, call the ReadLine method, store the contents of the first line in a variable, and then perform some action. When the script loops around, it will automatically drop down a line and read the second line of the file into the variable. This will continue until each line has been read (or until the script specifically exits the loop).
And a quick example:
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\FSO\ServerList.txt", 1)
Do Until objFile.AtEndOfStream
strLine = objFile.ReadLine
MsgBox strLine
Loop
objFile.Close
Line Input works fine for small files. However, when file sizes reach around 90k, Line Input jumps all over the place and reads data in the wrong order from the source file.
I tested it with different filesizes:
49k = ok
60k = ok
78k = ok
85k = ok
93k = error
101k = error
127k = error
156k = error
Lesson learned - use Scripting.FileSystemObject
With that code you load the file in memory (as a big string) and then you read that string line by line.
By using Mid$() and InStr() you actually read the "file" twice but since it's in memory, there is no problem.
I don't know if VB's String has a length limit (probably not), but if the text files are hundreds of megabytes in size you will likely see a performance drop due to virtual memory usage.
I would think that in a large-file scenario using a stream would be far more efficient, because memory consumption would be very small.
But your algorithm could alternate between using a stream and loading the entire thing in memory based on the file size. I wouldn't be surprised if one is only better than the other under certain criteria.
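A rough sketch of such a size-based switch is shown below. The 50 MB threshold is an arbitrary assumption, not a measured value, and LongestLineLength is a hypothetical example task; the binary branch also assumes single-byte (ANSI) text with CRLF line endings.

```vb
' Sketch: pick a reading strategy from the file size.
' LongestLineLength and WHOLE_FILE_LIMIT are illustrative (assumptions).
Public Function LongestLineLength(ByVal filePath As String) As Long
    Const WHOLE_FILE_LIMIT As Long = 52428800   ' 50 MB; arbitrary threshold
    Dim fileNum As Integer
    Dim buffer As String
    Dim oneLine As String
    Dim parts() As String
    Dim best As Long
    Dim i As Long

    fileNum = FreeFile
    If FileLen(filePath) <= WHOLE_FILE_LIMIT Then
        ' Small file: one binary read, then split in memory.
        Open filePath For Binary Access Read As #fileNum
        buffer = Space$(LOF(fileNum))
        Get #fileNum, , buffer
        Close #fileNum
        parts = Split(buffer, vbCrLf)
        For i = 0 To UBound(parts)
            If Len(parts(i)) > best Then best = Len(parts(i))
        Next i
    Else
        ' Large file: stream it so only one line is held at a time.
        Open filePath For Input As #fileNum
        Do While Not EOF(fileNum)
            Line Input #fileNum, oneLine
            If Len(oneLine) > best Then best = Len(oneLine)
        Loop
        Close #fileNum
    End If

    LongestLineLength = best
End Function
```

Where the crossover really lies depends on the machine and the files, so the threshold is best treated as something to measure, not a fixed rule.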
You can modify the above to read the full file in one go and then display each line, as shown below:
Option Explicit
Public Function QuickRead(FName As String) As Variant
Dim i As Integer
Dim res As String
Dim l As Long
Dim v As Variant
i = FreeFile
l = FileLen(FName)
res = Space(l)
Open FName For Binary Access Read As #i
Get #i, , res
Close i
'split the file with vbcrlf
QuickRead = Split(res, vbCrLf)
End Function
Sub Test()
' You can replace "C:\writename.txt" with any file name you desire
Dim strFilePathName As String: strFilePathName = "C:\writename.txt"
Dim strFileLine As String
Dim v As Variant
Dim i As Long
v = QuickRead(strFilePathName)
For i = 0 To UBound(v)
MsgBox v(i)
Next
End Sub
My take on it...obviously, you've got to do something with the data you read in. If it involves writing it to the sheet, that'll be deadly slow with a normal For Loop. I came up with the following based upon a rehash of some of the items there, plus some help from the Chip Pearson website.
Reading in the text file (assuming you don't know the length of the range it will create, so only the startingCell is given):
Public Sub ReadInPlainText(startCell As Range, Optional textfilename As Variant)
If IsMissing(textfilename) Then textfilename = Application.GetOpenFilename("All Files (*.*), *.*", , "Select Text File to Read")
If textfilename = "" Then Exit Sub
Dim filelength As Long
Dim filenumber As Integer
filenumber = FreeFile
filelength = filelen(textfilename)
Dim text As String
Dim textlines As Variant
Open textfilename For Binary Access Read As filenumber
text = Space(filelength)
Get #filenumber, , text
'split the file with vbcrlf
textlines = Split(text, vbCrLf)
'output to range
Dim outputRange As Range
Set outputRange = startCell
Set outputRange = outputRange.Resize(UBound(textlines), 1)
outputRange.Value = Application.Transpose(textlines)
Close filenumber
End Sub
Conversely, if you need to write out a range to a text file, this does it quickly in one print statement (note: the file 'Open' type here is in text mode, not binary..unlike the read routine above).
Public Sub WriteRangeAsPlainText(ExportRange As Range, Optional textfilename As Variant)
If IsMissing(textfilename) Then textfilename = Application.GetSaveAsFilename(FileFilter:="Text Files (*.txt), *.txt")
If textfilename = "" Then Exit Sub
Dim filenumber As Integer
filenumber = FreeFile
Open textfilename For Output As filenumber
Dim textlines() As Variant, outputvar As Variant
textlines = Application.Transpose(ExportRange.Value)
outputvar = Join(textlines, vbCrLf)
Print #filenumber, outputvar
Close filenumber
End Sub
Be careful when using Application.Transpose with a huge number of values. If you transpose values into a column, Excel assumes that they came from a row.
Since the maximum column count is smaller than the maximum row count, it will only display the first (maximum column count) values, and anything after that will be "N/A".
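One way to sidestep Transpose altogether is to build a two-dimensional, one-column array by hand and assign it to the range in a single write. The following is only a sketch; WriteLinesToColumn is a hypothetical name, and it assumes the input is a one-dimensional array such as the one returned by Split.

```vb
' Sketch: write a 1-D array of lines into one column without Transpose.
' WriteLinesToColumn is an illustrative name (assumption).
Public Sub WriteLinesToColumn(ByVal startCell As Range, ByVal textLines As Variant)
    Dim rowCount As Long
    Dim i As Long
    Dim cells2D() As Variant

    rowCount = UBound(textLines) - LBound(textLines) + 1
    If rowCount < 1 Then Exit Sub                  ' nothing to write

    ReDim cells2D(1 To rowCount, 1 To 1)           ' rows x 1 column
    For i = 1 To rowCount
        cells2D(i, 1) = textLines(LBound(textLines) + i - 1)
    Next i

    startCell.Resize(rowCount, 1).Value = cells2D  ' single write to the sheet
End Sub
```

The loop runs over an in-memory array, not over cells, so the sheet is still touched only once.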
I just wanted to share some of my results...
I have text files, which apparently came from a Linux system, so I only have a vbLF/Chr(10) at the end of each line and not vbCR/Chr(13).
Note 1:
This meant that the Line Input method would read in the entire file, instead of just one line at a time.
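If the whole file has already been read into a string (for example with a binary Get, as in the QuickRead functions earlier in this thread), one possible workaround is to normalize the line endings before splitting. SplitAnyNewline is a hypothetical helper name.

```vb
' Sketch: split text into lines regardless of CRLF, LF or CR endings.
' SplitAnyNewline and fileText are illustrative names (assumptions).
Public Function SplitAnyNewline(ByVal fileText As String) As Variant
    fileText = Replace(fileText, vbCrLf, vbLf)   ' Windows endings -> LF
    fileText = Replace(fileText, vbCr, vbLf)     ' lone CR -> LF
    SplitAnyNewline = Split(fileText, vbLf)      ' zero-based array of lines
End Function
```

A file that ends with a line break yields one trailing empty element, which the caller may want to skip.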
From my research testing small (152KB) and large (2778KB) files, both on and off the network, I found the following:
Open FileName For Input: Line Input was the slowest (See Note 1 above)
Open FileName For Binary Access Read: Input was the fastest for reading the whole file
FSO.OpenTextFile: ReadLine was fast, but a bit slower than Binary Input
Note 2:
If I just needed to check the file header (first 1-2 lines) to see if I had the proper file/format, then FSO.OpenTextFile was the fastest, followed very closely by Binary Input.
The drawback with Binary Input is that you have to know how many characters you want to read.
On normal files, Line Input would be a good option as well, but I couldn't test it due to Note 1.
Note 3:
Obviously, the files on the network showed the largest difference in read speed. They also showed the greatest benefit from reading the file a second time (although there are certainly memory buffers that come into play here).