Import specific lines from text file into Excel spreadsheet using VBA

I'm trying to set up a macro that will import specific lines from a text file into an Excel spreadsheet. I am currently using the InStr function to locate a specific word, then counting how many characters over I need to read to import the data into the cells.
The reason I am doing it this way is that the file is over 3500 lines long and is not delimited or comma separated in any sense. Some of the data is repeated as well, which causes problems with the above tactic.
What I need help with is how to import only about 20 specific lines into the spreadsheet (multiple sheets will be used, but I can reuse this code), while using a technique like the one I mentioned earlier so I can decide where it's reading from.
Thanks!

This is what I use constantly. Just replace the file location with the path to your text file and this little snippet will read it line by line. Then you can apply logic to a single line and decide what to do with it. It's a good solution when you're dealing with data that regex statements can't handle accurately.
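Sub LoadSettings()
    ' Requires a reference to Microsoft Scripting Runtime
    Dim fso As Scripting.FileSystemObject
    Dim F As Scripting.File
    Dim TS As Scripting.TextStream
    Dim fileloc As String
    Dim RptText As String
    Set fso = New Scripting.FileSystemObject
    fileloc = "Whereyourtextfileislocated"
    Set F = fso.GetFile(fileloc)
    Set TS = F.OpenAsTextStream(1, -2) ' 1 = ForReading, -2 = TristateUseDefault
    ' Stop at the end of the file instead of looping forever
    Do While Not TS.AtEndOfStream
        RptText = TS.ReadLine
        ' Replace this test with whatever criteria you need
        If InStr(RptText, "whateveryouwant") > 0 Then
            ' do something with RptText
        End If
    Loop
    TS.Close
End Sub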
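For the "find a word, then read a fixed number of characters" step described in the question, a minimal sketch might look like the following. The helper name FieldAfter, its parameters, and the sample keyword and sheet name in the usage note are all hypothetical; adjust the offsets to your own file layout.

```vba
Option Explicit

' Hypothetical helper: returns fieldLen characters starting offsetChars
' after the end of keyword, trimmed. Returns "" when keyword is absent.
Public Function FieldAfter(ByVal lineText As String, ByVal keyword As String, _
        ByVal offsetChars As Long, ByVal fieldLen As Long) As String
    Dim pos As Long
    pos = InStr(1, lineText, keyword, vbTextCompare)
    If pos = 0 Then Exit Function
    FieldAfter = Trim$(Mid$(lineText, pos + Len(keyword) + offsetChars, fieldLen))
End Function
```

Inside the read loop this could be used as, for example, Worksheets("Sheet1").Range("A1").Value = FieldAfter(RptText, "Invoice No:", 1, 8). When the same keyword appears on several lines, keeping a counter of matches and acting only on the occurrence you want is one way to tell them apart.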

Related

Using 'FileLen' with a four digit file extension

I am working on a routine that gets the file length for each of a large number of image files. When the routine runs FileLen against most files it works perfectly, but some of the images have the file extension '.jpeg' and the FileLen command produces a 'File not found' error for these files. The code line I'm using is:
ActiveCell.Offset(ColumnOffset:=2).Value = FileLen(D & N)
Where D is a text variable containing the drive letter and N is a text variable containing the path and filename.
I have tested the string variables and they are supplying the correct full path and filename to the FileLen command. I have also set up a test routine to check with other files and this produces the same result. Am I correct in assuming that FileLen does not work with four-character file extensions? Is there a simple way round the issue?
The routine will be checking and comparing around 240,000 files, with a fair proportion being .jpeg, so going in and changing the extensions isn't an option.
Rob

FileLen can handle extensions with more than 3 characters, so that's not your problem.
Assuming that your values for D and N are correct (consider using more meaningful names for your variables), I can imagine that it may be confused by interference between the short and long names of a file, but I cannot prove this.
You could try using the FileSystemObject as an alternative. Add a reference to the Microsoft Scripting Runtime and use:
Option Explicit
Dim fso As FileSystemObject
Function getFSO() As FileSystemObject
' Create object only if necessary
If fso Is Nothing Then Set fso = New FileSystemObject
Set getFSO = fso
End Function
Function getFilesize(filename As String) As Long
' Return the size of a file or -1 if not found or any error
getFilesize = -1
On Error Resume Next
getFilesize = getFSO.GetFile(filename).Size
On Error GoTo 0
End Function
Usage:
ActiveCell.Offset(ColumnOffset:=2).Value = getFilesize(D & N)

Import semicolon separated CSV file using VBA without rename to txt

This is not a duplicate, since I want a solution that does not consist of reformatting the file to txt.
My intention is to open a CSV file using semicolon as the delimiter. For that purpose I have used the following code:
Sub prueba2()
Dim sfile As String
Dim wb As Workbook
Dim Path As String
Dim Namefile As String
Path = "V:\evfilesce9i9\apps9\vbe9\dep4\KFTP\KFTP001D_FicherosCeca"
Namefile = "\QryCECARFSECTORIAL0239*.txt"
Set wb = Workbooks.Open(Filename:=Path & Namefile, Delimiter:=";")
End Sub
When I try it, the file is opened using commas as the delimiter instead of the one I specified (semicolon).
I have read in other questions that this is normal in post-2006 Excel versions, and that the fastest solution is to reformat the file to txt.
This does not fit my needs because I have to do it without changing the format. I can't find any solution.
Could someone help me?

Please see the MS documentation here.
I think you want to use the Format parameter, and not the delimiter parameter.
Try:
Set wb = Workbooks.Open(Filename:=Path & Namefile, Format:=4)
It seems like the Delimiter argument is only used if Format is set to 6, which signifies a custom delimiter character. Semi-colon is a standard delimiter.
Edit:
Hmm... so, this seems to be something that's been tricky in Excel/VBA for a while.
After some more research, it appears the Format option may only be used when opening .txt files, which is why "reformat the file to .txt" is one possible solution.
There are some things that can be done, however.
Excel will handle opening a semicolon delimited file well if the first line of the file is:
sep=;
I know you said you could not reformat the files, but is that something that you can do?
If not, the next things I would suggest would be to either: 1) use the Open Statement to open your file and then write it to a temporary file (perhaps as a .txt), to be reopened with the original Workbooks.Open(Format:=4), or 2) write your own text importer. A sample text importer can be found in this stackoverflow page.
Sub ImportCSVFile(filepath As String)
Dim line As String
Dim arrayOfElements
Dim linenumber As Long ' Long avoids overflow past 32,767 rows
Dim elementnumber As Long
Dim element As Variant
linenumber = 0
elementnumber = 0
Open filepath For Input As #1 ' Open file for input
Do While Not EOF(1) ' Loop until end of file
linenumber = linenumber + 1
Line Input #1, line
arrayOfElements = Split(line, ";")
elementnumber = 0
For Each element In arrayOfElements
elementnumber = elementnumber + 1
Cells(linenumber, elementnumber).Value = element
Next
Loop
Close #1 ' Close file.
End Sub
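Option 1 above (a temporary .txt copy that is then opened with Format:=4) could be sketched roughly as follows. The names CopyToTempTxt and OpenSemicolonFile and the temporary file name are hypothetical, and whether Format:=4 is honoured for the copy rests on the research described in the answer rather than on anything verified here. The original file is left untouched.

```vba
Option Explicit

' Copies the source file to a temporary .txt file and returns the new path.
' Assumption: the TEMP environment variable points to a writable folder.
Public Function CopyToTempTxt(ByVal sourcePath As String) As String
    Dim tempPath As String
    tempPath = Environ$("TEMP") & "\semicolon_import_" & _
               Format$(Now, "yyyymmdd_hhnnss") & ".txt"
    FileCopy sourcePath, tempPath
    CopyToTempTxt = tempPath
End Function

' Opens the temporary copy with semicolon as the delimiter (Format:=4).
Public Sub OpenSemicolonFile(ByVal sourcePath As String)
    Dim wb As Workbook
    Set wb = Workbooks.Open(Filename:=CopyToTempTxt(sourcePath), Format:=4)
End Sub
```

FileCopy needs a concrete file name, so a wildcard pattern would have to be resolved first, for example with Dir.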

Access export query as text with header and footer

I've been researching this issue high and low for an answer or at least a template to go by.
I am using MS Access 2007. I need to export a query as a text file with fixed-width specifications (already done). The problem I am running into is that I must have a specific header and footer added to the export. The header must contain the current date and the trailer must contain the total number of items being exported.
I am admittedly in over my head, but usually can stumble along with some VBA code that does something similar.
Can anyone help?

There isn't any way to define extra lines of text in an export.
I assume you are using the TransferText method to export your query in fixed-width format. That's typically the right approach for generating the fixed-width content, with or without field headers.
But if you want to add lines to the file before and after the data content, then you'll need to open the existing file, create a new file, write the header lines, copy the data from the existing file to the new file, write the footer lines, and then close both files.
You could use the built-in VBA functions for working with files, but I find the Microsoft Scripting Runtime library offers more intuitive, object-oriented ways of working with files.
You'll need to add a reference to the Microsoft Scripting Runtime library under Tools > References.
Sub EnhanceExportedFile()
Const exportedFilePath As String = "C:\Foo.txt"
Const newFilePath As String = "C:\NewFoo.txt"
Dim fso As Scripting.FileSystemObject
Dim exportedFile As TextStream
Dim newFile As TextStream
Dim rowCount As Long
Set fso = New Scripting.FileSystemObject
Set exportedFile = fso.OpenTextFile(exportedFilePath, ForReading, False)
Set newFile = fso.CreateTextFile(newFilePath, True)
'Append the date in ISO format
newFile.WriteLine Format(Now, "yyyy-mm-dd")
'Append each line in the exported file
Do While Not exportedFile.AtEndOfStream
newFile.WriteLine exportedFile.ReadLine
rowCount = rowCount + 1
Loop
'Append the total exported lines
newFile.WriteLine rowCount
'Close both files
exportedFile.Close
newFile.Close
End Sub

Use a union query.
Suppose your query has the fields ID (autonumber, long), firstname and lastname, and your table name is tablexx. If you have sequential IDs, it could be something like this:
Create a query for the header:
Select 0 as id, format(date(),"dd/mm/yyyy") as firstname, "" as lastname, "" as nextfield etc etc from tablexx order by id;
and a query for the footer:
Select 9999999999 (much bigger than your expected id) as id, (select count(id) from tablexx) as firstname, "" as lastname, "" as nextfield etc etc from tablexx order by id;
Now do a union of the three (the header query, your original query and the footer query). Even blank lines can be put in (id = 1 etc).

reverse engineer vba code excel

I am not a VBA programmer. However, I have the 'unpleasant' task of re-implementing someone's VBA code in another language. The VBA code consists of 75 modules which use one massive 'calculation sheet' to store all 'global variables'. So instead of using descriptive variable names, it often uses:
= Worksheets("bla").Cells(100, 75).Value
or
Worksheets("bla").Cells(100, 75).Value =
To make things worse, the 'calculation sheet' also contains some formulas.
Are there any (free) tools which allow you to reverse engineer such code (e.g. create Nassi–Shneiderman diagram, flowcharts)? Thanks.

I think @JulianKnight's suggestion should work.
Building on this, you could:
Copy all the code to a text editor capable of RegEx search/replace (Eg. Notepad++).
Then use the RegEx search/Replace with a search query like:
Worksheets\("bla"\)\.Cells\((\d+), (\d+)\)\.Value
And replace with:
Var_\1_\2
This will convert all the sheet stored values to variable names with row column indices.
Example:
Worksheets("bla").Cells(100, 75).Value To Var_100_75
These variables still need to be initialized.
This may be done by writing VBA code which simply reads every (relevant) cell in the "Bla" worksheet and writes it out to a text file as variable initialization code.
Example:
Dim FSO As FileSystemObject
Dim FSOFile As TextStream
Dim FilePath As String
Dim col As Long, row As Long ' "Dim col, row As Integer" would leave col as a Variant
FilePath = "c:\WriteTest.txt" ' create a test.txt file or change this
Set FSO = New FileSystemObject
' opens file in write mode
Set FSOFile = FSO.OpenTextFile(FilePath, 2, True)
'loop round adding lines
For col = 1 To Whatever_is_the_column_limit
For row = 1 To Whatever_is_the_row_limit
' Construct the output line
' CStr avoids the leading space that Str adds to positive numbers
FSOFile.WriteLine "Var_" & CStr(row) & "_" & CStr(col) & _
" = " & CStr(Worksheets("Bla").Cells(row, col).Value)
Next row
Next col
FSOFile.Close
Obviously you need to correct the output line syntax and variable name structure for whatever other language you need to use.
P.S. If you are not familiar with RegEx (Regular Expressions), you will find a plethora of articles on the web explaining it.

What is a superfast way to read large files line-by-line in VBA?

I believe I have come up with a very efficient way to read very, very large files line-by-line. Please tell me if you know of a better/faster way or see room for improvement. I am trying to get better at coding, so any sort of advice you have would be nice. Hopefully this is something that other people might find useful, too.
It appears to be something like 8 times faster than using Line Input from my tests.
'This function reads a file into a string. '
'I found this in the book Programming Excel with VBA and .NET. '
Public Function QuickRead(FName As String) As String
Dim I As Integer
Dim res As String
Dim l As Long
I = FreeFile
l = FileLen(FName)
res = Space(l)
Open FName For Binary Access Read As #I
Get #I, , res
Close I
QuickRead = res
End Function
'This function works like the Line Input statement'
Public Sub QRLineInput( _
ByRef strFileData As String, _
ByRef lngFilePosition As Long, _
ByRef strOutputString, _
ByRef blnEOF As Boolean _
)
On Error GoTo LastLine
strOutputString = Mid$(strFileData, lngFilePosition, _
InStr(lngFilePosition, strFileData, vbNewLine) - lngFilePosition)
lngFilePosition = InStr(lngFilePosition, strFileData, vbNewLine) + 2
Exit Sub
LastLine:
blnEOF = True
End Sub
Sub Test()
Dim strFilePathName As String: strFilePathName = "C:\Fld\File.txt"
Dim strFile As String
Dim lngPos As Long
Dim blnEOF As Boolean
Dim strFileLine As String
strFile = QuickRead(strFilePathName) & vbNewLine
lngPos = 1
Do Until blnEOF
Call QRLineInput(strFile, lngPos, strFileLine, blnEOF)
Loop
End Sub
Thanks for the advice!

My two cents…
Not long ago I needed to read large files using VBA and noticed this question. I tested three approaches to reading data from a file, comparing their speed and reliability for a wide range of file sizes and line lengths. The approaches are:
1) Line Input VBA statement
2) Using the File System Object (FSO)
3) Using the Get VBA statement for the whole file and then parsing the string read, as described in posts here
Each test case consists of three steps:
1) Test case setup that writes a text file containing a given number of lines of the same given length, filled with a known character pattern.
2) Integrity test. Read each file line and verify its length and contents.
3) File read speed test. Read each line of the file, repeated 10 times.
As you can see, Step #3 measures the true file read speed (as asked in the question), while Step #2 verifies the file read integrity and therefore simulates real conditions where string parsing is needed.
The following chart shows the test results for the File read speed test. The file size is 64M bytes for all tests, and the tests differ in line length that varies from 2 bytes (not including CRLF) to 8M bytes.
CONCLUSION:
All the three methods are reliable for large files with normal and abnormal line lengths (please compare to Graeme Howard’s answer)
All the three methods produce almost equivalent file reading speed for normal line lengths
“Superfast way” (Method #3) works fine for extremely long lines while the other two don’t.
All this is applicable to different Offices, different PCs, for VBA and VB6

You can use Scripting.FileSystemObject to do that.
From the Reference:
The ReadLine method allows a script to read individual lines in a text file. To use this method, open the text file, and then set up a Do Loop that continues until the AtEndOfStream property is True. (This simply means that you have reached the end of the file.) Within the Do Loop, call the ReadLine method, store the contents of the first line in a variable, and then perform some action. When the script loops around, it will automatically drop down a line and read the second line of the file into the variable. This will continue until each line has been read (or until the script specifically exits the loop).
And a quick example:
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.OpenTextFile("C:\FSO\ServerList.txt", 1)
Do Until objFile.AtEndOfStream
strLine = objFile.ReadLine
MsgBox strLine
Loop
objFile.Close

Line Input works fine for small files. However, when file sizes reach around 90k, Line Input jumps all over the place and reads data in the wrong order from the source file.
I tested it with different filesizes:
49k = ok
60k = ok
78k = ok
85k = ok
93k = error
101k = error
127k = error
156k = error
Lesson learned - use Scripting.FileSystemObject

With that code you load the file into memory (as one big string) and then read that string line by line.
By using Mid$() and InStr() you actually scan the "file" twice, but since it's in memory, that is not a problem.
I don't know if VB's String has a length limit (probably not), but if the text files are hundreds of megabytes in size you're likely to see a performance drop due to virtual memory usage.

I would think that, in a large-file scenario, using a stream would be far more efficient, because memory consumption would be very small.
But your algorithm could alternate between using a stream and loading the entire thing in memory based on the file size. I wouldn't be surprised if one is only better than the other under certain criteria.
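The idea of switching strategy by file size could be sketched like this. The function name CountMatchingLines, the keyword-counting task and the 10 MB default cut-off are hypothetical choices made only for illustration; the right threshold would have to be measured on your own files.

```vba
Option Explicit

' Counts the lines containing keyword. Files up to limitBytes are read in one
' go and split in memory; larger files are read line by line so that only one
' line is held in memory at a time. Assumes CRLF line endings.
Public Function CountMatchingLines(ByVal filePath As String, _
        ByVal keyword As String, _
        Optional ByVal limitBytes As Long = 10485760) As Long
    Dim fileNum As Integer
    Dim buffer As String
    Dim parts() As String
    Dim i As Long
    Dim hits As Long

    If FileLen(filePath) = 0 Then Exit Function ' empty file, nothing to count

    fileNum = FreeFile
    If FileLen(filePath) <= limitBytes Then
        ' Small file: load everything, then scan the array
        Open filePath For Binary Access Read As #fileNum
        buffer = Space$(LOF(fileNum))
        Get #fileNum, , buffer
        Close #fileNum
        parts = Split(buffer, vbCrLf)
        For i = LBound(parts) To UBound(parts)
            If InStr(1, parts(i), keyword) > 0 Then hits = hits + 1
        Next i
    Else
        ' Large file: stream it
        Open filePath For Input As #fileNum
        Do While Not EOF(fileNum)
            Line Input #fileNum, buffer
            If InStr(1, buffer, keyword) > 0 Then hits = hits + 1
        Loop
        Close #fileNum
    End If
    CountMatchingLines = hits
End Function
```

Passing a limitBytes of 0 forces the streaming branch, which makes it easy to time both paths on the same file.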

You can modify the code above to read the full file in one go and then display each line, as shown below:
Option Explicit
Public Function QuickRead(FName As String) As Variant
Dim i As Integer
Dim res As String
Dim l As Long
Dim v As Variant
i = FreeFile
l = FileLen(FName)
res = Space(l)
Open FName For Binary Access Read As #i
Get #i, , res
Close i
'split the file with vbcrlf
QuickRead = Split(res, vbCrLf)
End Function
Sub Test()
' you can replace file for "c:\writename.txt to any file name you desire
Dim strFilePathName As String: strFilePathName = "C:\writename.txt"
Dim strFileLine As String
Dim v As Variant
Dim i As Long
v = QuickRead(strFilePathName)
For i = 0 To UBound(v)
MsgBox v(i)
Next
End Sub

My take on it...obviously, you've got to do something with the data you read in. If it involves writing it to the sheet, that'll be deadly slow with a normal For Loop. I came up with the following based upon a rehash of some of the items there, plus some help from the Chip Pearson website.
Reading in the text file (assuming you don't know the length of the range it will create, so only the startingCell is given):
Public Sub ReadInPlainText(startCell As Range, Optional textfilename As Variant)
If IsMissing(textfilename) Then textfilename = Application.GetOpenFilename("All Files (*.*), *.*", , "Select Text File to Read")
If textfilename = "" Then Exit Sub
Dim filelength As Long
Dim filenumber As Integer
filenumber = FreeFile
filelength = filelen(textfilename)
Dim text As String
Dim textlines As Variant
Open textfilename For Binary Access Read As filenumber
text = Space(filelength)
Get #filenumber, , text
'split the file with vbcrlf
textlines = Split(text, vbCrLf)
'output to range
Dim outputRange As Range
Set outputRange = startCell
Set outputRange = outputRange.Resize(UBound(textlines), 1)
outputRange.Value = Application.Transpose(textlines)
Close filenumber
End Sub
Conversely, if you need to write out a range to a text file, this does it quickly in one print statement (note: the file 'Open' type here is in text mode, not binary..unlike the read routine above).
Public Sub WriteRangeAsPlainText(ExportRange As Range, Optional textfilename As Variant)
If IsMissing(textfilename) Then textfilename = Application.GetSaveAsFilename(FileFilter:="Text Files (*.txt), *.txt")
If textfilename = "" Then Exit Sub
Dim filenumber As Integer
filenumber = FreeFile
Open textfilename For Output As filenumber
Dim textlines() As Variant, outputvar As Variant
textlines = Application.Transpose(ExportRange.Value)
outputvar = Join(textlines, vbCrLf)
Print #filenumber, outputvar
Close filenumber
End Sub

Be careful when using Application.Transpose with a huge number of values. If you transpose values into a column, Excel will assume you transposed them from a row.
Since the maximum column count is smaller than the maximum row count, it will only display the first (maximum column count) values, and anything after that will be "#N/A".
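One way to sidestep Application.Transpose altogether is to copy the lines into a two-dimensional (n x 1) array and assign that to the range in a single step. This is a minimal sketch; the function name ToColumnArray is hypothetical, and it assumes the input is a one-dimensional array such as the result of Split.

```vba
Option Explicit

' Converts a one-dimensional array into a 1-based (n x 1) Variant array
' that can be assigned directly to a single-column range.
' Returns Empty when the input array has no elements.
Public Function ToColumnArray(ByVal items As Variant) As Variant
    Dim n As Long
    Dim i As Long
    Dim result() As Variant

    n = UBound(items) - LBound(items) + 1
    If n < 1 Then Exit Function

    ReDim result(1 To n, 1 To 1)
    For i = 1 To n
        result(i, 1) = items(LBound(items) + i - 1)
    Next i
    ToColumnArray = result
End Function
```

A possible use in the read routine above would be: arr = ToColumnArray(textlines), followed by startCell.Resize(UBound(arr, 1), 1).Value = arr.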

I just wanted to share some of my results...
I have text files, which apparently came from a Linux system, so I only have a vbLf/Chr(10) at the end of each line and not vbCr/Chr(13).
Note 1:
This meant that the Line Input method would read in the entire file, instead of just one line at a time.
From my research testing small (152KB) & large (2778KB) files, both on and off the network, I found the following:
Open FileName For Input: Line Input was the slowest (see Note 1 above)
Open FileName For Binary Access Read: Input was the fastest for reading the whole file
FSO.OpenTextFile: ReadLine was fast, but a bit slower than Binary Input
Note 2:
If I just needed to check the file header (first 1-2 lines) to check if I had the proper file/format, then FSO.OpenTextFile was the fastest, followed very closely by Binary Input.
The drawback with the Binary Input is that you have to know how many characters you want to read.
On normal files, Line Input would also be a good option, but I couldn't test due to Note 1.
 
Note 3:
Obviously, the files on the network showed the largest difference in read speed. They also showed the greatest benefit from reading the file a second time (although there are certainly memory buffers that come into play here).
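For files with LF-only (or mixed) line endings like the ones described in Note 1, one option is to read the whole file into a string and normalise the line endings before splitting. A minimal sketch, with the hypothetical name SplitLinesAnyEol:

```vba
Option Explicit

' Splits text into lines regardless of whether they end in CRLF, LF or CR.
' Returns a zero-based array of strings.
Public Function SplitLinesAnyEol(ByVal text As String) As Variant
    Dim normalized As String
    normalized = Replace(text, vbCrLf, vbLf) ' Windows endings first
    normalized = Replace(normalized, vbCr, vbLf) ' then any lone CR
    SplitLinesAnyEol = Split(normalized, vbLf)
End Function
```

The string passed in could come from a binary Get of the whole file, as in the QuickRead function earlier on this page.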
