Function to count number of lines in a text file - scripting

I need a function that accepts a filename as a parameter and returns the number of lines in that file.
It should take under 30 seconds to count the lines of a 10-million-line file.
I currently have something along these lines, but it is too slow with large files:
Dim objFSO, strTextFile, strData, arrLines, LineCount
CONST ForReading = 1
'name of the text file
strTextFile = "sample.txt"
'Create a File System Object
Set objFSO = CreateObject("Scripting.FileSystemObject")
'Open the text file - strData now contains the whole file
strData = objFSO.OpenTextFile(strTextFile,ForReading).ReadAll
'Split by lines, put into an array
arrLines = Split(strData,vbCrLf)
'Use UBound to count the lines
LineCount = UBound(arrLines) + 1
wscript.echo LineCount
'Cleanup
Set objFSO = Nothing

If somebody is still looking for a faster way, here is the code:
Const ForAppending = 8
Set fso = CreateObject("Scripting.FileSystemObject")
Set theFile = fso.OpenTextFile("C:\textfile.txt", ForAppending, True) ' True = create if missing; VBScript has no named arguments
WScript.Echo theFile.Line
Set Fso = Nothing
Of course, the processing time depends heavily on the file size, not only on the number of lines. Compared with the RegEx method, the TextStream.Line property is at least 3 times quicker.

The only alternative I see is to read the lines one by one (EDIT: or even just skip them one by one) instead of reading the whole file at once. Unfortunately I can't test which is faster right now. I imagine skipping is quicker.
Dim objFSO, txsInput, strTemp, arrLines
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
strTextFile = "sample.txt"
Set txsInput = objFSO.OpenTextFile(strTextFile, ForReading)
'Skip lines one by one
Do While txsInput.AtEndOfStream <> True
txsInput.SkipLine ' or strTemp = txsInput.ReadLine
Loop
wscript.echo txsInput.Line-1 ' Returns the number of lines
'Cleanup
Set objFSO = Nothing
Incidentally, I took the liberty of removing some of your 'comments. In terms of good practice, they were superfluous and didn't really add any explanatory value, especially when they basically repeated the method names themselves, e.g.
'Create a File System Object
... CreateObject("Scripting.FileSystemObject")

Too large files...
The following is the fastest and most efficient way I know of:
Dim oFso, oReg, sData, lCount
Const ForReading = 1, sPath = "C:\file.txt"
Set oReg = New RegExp
Set oFso = CreateObject("Scripting.FileSystemObject")
sData = oFso.OpenTextFile(sPath, ForReading).ReadAll
With oReg
.Global = True
.Pattern = "\r\n" 'vbCrLf
'.Pattern = "\n" ' vbLf, Unix style line-endings
lCount = .Execute(sData).Count + 1
End With
WScript.Echo lCount
Set oFso = Nothing
Set oReg = Nothing

You could try some variation on this:
Const ForReading = 1
c = 0
Set fso = CreateObject("Scripting.FileSystemObject")
Set theFile = fso.OpenTextFile(filespec, ForReading, False) ' filespec: path of the file to count
Do While theFile.AtEndOfStream <> True
    theFile.SkipLine
    c = c + 1
Loop
theFile.Close
WScript.Echo c,"lines"
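The question asks for a function that takes a filename. A minimal sketch of the SkipLine loop wrapped that way might look like the following; the name CountLines is hypothetical, and sample.txt is the file name used in the question.

```vbscript
Option Explicit

' Hypothetical helper (name not from the thread): counts lines by
' skipping through the stream without storing any line contents.
Function CountLines(filePath)
    Const ForReading = 1
    Dim fso, stream, count
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set stream = fso.OpenTextFile(filePath, ForReading, False)
    count = 0
    Do Until stream.AtEndOfStream
        stream.SkipLine
        count = count + 1
    Loop
    stream.Close
    CountLines = count
End Function

WScript.Echo CountLines("sample.txt")
```

Run it with cscript so the result goes to the console rather than a message box. Whether it meets the 30-second target depends on the file size and disk, so it is worth timing on a real file.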

txt = "c:\YourTxtFile.txt"
j = 0
Dim read
Open txt For Input As #1
Do While Not EOF(1)
Input #1, read
j = j + 1
Loop
Close #1
If it adds an empty last line the result is (j - 1).
It works fine for one column in the txt file.

How to count all lines in the notepad
Answers:
=> Below is the code -
Set t1=createObject("Scripting.FileSystemObject")
Set t2=t1.openTextFile ("C:\temp\temp1\temp2_VBSCode.txt",1)
Do Until t2.AtEndOfStream
strlinenumber = t2.Line
strLine = t2.Readline
Loop
msgbox strlinenumber
t2.Close

I was looking for a faster way than what I already had to determine the number of lines in a text file. I searched the internet and came across two promising solutions. One was based on SQL, the other was the FSO-based solution by Kul-Tigin that I found here. I tested them, and this is part of the result:
Number of lines   Time elapsed   Variant
--------------------------------------------------------
110               00:00:00.70    SQL
110               00:00:00.00    Vanilla VBA (my solution)
110               00:00:00.16    FSO
--------------------------------------------------------
1445014           00:00:17.25    SQL
1445014           00:00:09.19    Vanilla VBA (my solution)
1445014           00:00:17.73    FSO
I ran this several times with large and small files. Time and again the vanilla VBA came out on top. I know this thread is old, but for anyone still looking for the fastest way to determine the number of lines in a csv/text file, here's the code I use.
Public Function GetNumRecs(ASCFile As String) As Long
Dim InStream As Long
Dim Record As String
InStream = FreeFile
GetNumRecs = 0
Open ASCFile For Input As #InStream
Do While Not EOF(InStream)
Line Input #InStream, Record
GetNumRecs = GetNumRecs + 1
Loop
Close #InStream
End Function

Related

Best way to delete bad record from csv file using VBA

The following script works by adding the bad records to my exception file.
I just need the best way to delete the bad records from the source file after it's copied.
Public Sub readopentextfile()
Const ForReading = 1, ForWriting = 2, ForAppending = 8
Dim oFS, oFSO, oFSW
Dim stext As String
Set oFSO = CreateObject("Scripting.FileSystemObject")
'/ Open new DailySalesOrds for reading
Set oFS = oFSO.OpenTextFile("C:\DailySalesOrds.csv", ForReading)
'/ Open exception file for writing
Set oFSW = oFSO.OpenTextFile("C:\DailySalesOrderExcep.csv", ForAppending)
'/ Read each record and count commas if 21 or more add record to exception file
Do Until oFS.AtEndOfStream
stext = oFS.ReadLine
Do
c = InStr(c + 1, stext, ",")
If c <> 0 Then Count = Count + 1
Loop Until c = 0
'Debug.Print Count
If Count >= 21 Then
oFSW.Write (vbNewLine & stext)
End If
c = 0
Count = 0
Loop
End Sub
Write the good lines to a new file
Rename the source file with the extension .bak
Rename the new file as the original source file
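The three steps above could be sketched roughly as follows. This is an illustration under assumptions, not the asker's code: the helper names CountCommas and RemoveBadRecords, the ".tmp" working file, and appending ".bak" to the full file name are all made up here. The 21-comma threshold and the source path come from the question.

```vbscript
Option Explicit

Const ForReading = 1
Const MAX_COMMAS = 21   ' records with this many commas or more are "bad" (from the question)

' Hypothetical helper: number of commas in a record.
Function CountCommas(s)
    CountCommas = Len(s) - Len(Replace(s, ",", ""))
End Function

' Hypothetical helper: keeps only good records in sourcePath,
' leaving the original content in sourcePath & ".bak".
Sub RemoveBadRecords(sourcePath)
    Dim fso, inStream, outStream, s, tempPath, backupPath
    Set fso = CreateObject("Scripting.FileSystemObject")
    tempPath = sourcePath & ".tmp"
    backupPath = sourcePath & ".bak"

    ' Step 1: write the good lines to a new file
    Set inStream = fso.OpenTextFile(sourcePath, ForReading)
    Set outStream = fso.CreateTextFile(tempPath, True)
    Do Until inStream.AtEndOfStream
        s = inStream.ReadLine
        If CountCommas(s) < MAX_COMMAS Then outStream.WriteLine s
    Loop
    inStream.Close
    outStream.Close

    ' Step 2: rename the source file to .bak (replacing any older backup)
    If fso.FileExists(backupPath) Then fso.DeleteFile backupPath
    fso.MoveFile sourcePath, backupPath

    ' Step 3: rename the new file to the original name
    fso.MoveFile tempPath, sourcePath
End Sub

RemoveBadRecords "C:\DailySalesOrds.csv"
```

The same FileSystemObject calls work from VBA with late binding; only the final top-level call would need to move into a Sub.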

Reading text files with specific prefix

I have a folder with lots of text files each containing (but in random order) :
A = ...
B = ...
C = ...
Now I would like to import these text files into an excel-spreadsheet,
where each of the prefixes is organized in the columns, and the files are listed as rows
Example: 2 files
File 1:
A = 1
B = 2
C = 3
File 2:
A = 4
B = 5
C = 6
I would like the Excel sheet to look like:
NR / A / B / C
1 / 1 / 2 / 3
2 / 4 / 5 / 6
I am still learning VB, and this is just a bit over the top for me.
I have found a macro like this:
Sub Read_Text_Files()
Dim sPath As String, sLine As String
Dim oPath As Object, oFile As Object, oFSO As Object
Dim r As Long
'Files location
sPath = "C:\Test\"
r = 1
Set oFSO = CreateObject( _
"Scripting.FileSystemObject")
Set oPath = oFSO.GetFolder(sPath)
Application.ScreenUpdating = False
For Each oFile In oPath.Files
If LCase(Right(oFile.Name, 4)) = ".txt" Then
Open oFile For Input As #1 ' Open file for input.
Do While Not EOF(1) ' Loop until end of file.
Input #1, sLine ' Read data
If Left(sLine, 1) = "A=" Then 'Now i need to write this to the first column of that row
If Left(sLine, 1) = "B=" Then 'For the second column.
Range("A" & r).Formula = sLine ' Write data line
r = r + 1
Loop
Close #1 ' Close file.
End If
Next oFile
Application.ScreenUpdating = True
End Sub
Do you know how to open files in VBA for reading using syntax like Open and Line Input?
If not, read this: https://stackoverflow.com/a/11528932/2832561
I found this by googling for "VBA open file read"
Do you know how to work with and parse strings (and arrays) using functions like Mid, Left, Right, Split and Join?
If not, try reading this: http://www.exceltrick.com/formulas_macros/vba-split-function/
I found this by googling for "VBA String functions parse text"
Do you know how to work with Workbook and Worksheet objects and assign values to Range objects in Excel?
If not, try reading this: http://www.anthony-vba.kefra.com/vba/vbabasic2.htm
I found this by googling for "Workbook Worksheet Range VBA"
Once you have had a chance to try putting together a solution using these pieces, you can post specific questions on any issues you run into.
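As a starting point for the string-parsing piece, here is a small sketch of splitting a line such as "A = 1" into its prefix and value with Split and Trim. The helper names ParseLine and ColumnFor are hypothetical, and the column mapping (NR in column 1, A to C in columns 2 to 4) is an assumption based on the desired layout in the question.

```vbscript
Option Explicit

' Hypothetical helper: splits "A = 1" into key "A" and value "1".
' Returns False when the line contains no "=".
Function ParseLine(text, ByRef key, ByRef value)
    Dim parts
    ParseLine = False
    parts = Split(text, "=", 2)   ' at most two pieces: before and after the first "="
    If UBound(parts) = 1 Then
        key = UCase(Trim(parts(0)))
        value = Trim(parts(1))
        ParseLine = True
    End If
End Function

' Hypothetical mapping from prefix to spreadsheet column (assumed layout).
Function ColumnFor(key)
    Select Case key
        Case "A": ColumnFor = 2
        Case "B": ColumnFor = 3
        Case "C": ColumnFor = 4
        Case Else: ColumnFor = 0   ' unknown prefix
    End Select
End Function

Dim k, v
If ParseLine("B = 5", k, v) Then
    WScript.Echo k & " goes to column " & ColumnFor(k) & " with value " & v
End If
```

Because the lookup is by prefix, the order of the lines inside each file does not matter.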

Free up Memory: How to delete variables once I am done with them - VBA VB ACCESS

How do i free up Memory?
Say I have a string
Dim TestStri As String
TestStri = "Test"
' What do i have to type up to get rid of the variable?
' I know
TestStri = Nothing
' will give it the default value, but the variable is still there.
Can I use the same method for other variable types, e.g. Long, Integer, etc.?
I'm assuming you are referring to VB6 and VBA as indicated by your title, not VB.Net, as indicated by a keyword.
In VB6 and VBA the memory consumption of a string variable consists of a fixed part for the string's length and a terminator and a variable length part for the string contents itself. See http://www.aivosto.com/vbtips/stringopt2.html#memorylayout for a good explanation of this.
So, when you set the string variable to an empty string or vbNullString, you will be freeing up the variable part of the string but not the fixed part.
Other types like long, int, bool and date consume a fixed amount of memory.
You can't "free" local variables in VB completely (come to think of it, is there ANY programming language where you can do that?), and for the most part, you wouldn't care because the local variables themselves (the fixed portion) are usually very small.
The only case I can think of where the memory consumption of local variables could get big is if you have recursive function calls with deep or wide recursion.
I went a different route:
I was hoping MemoryUsage would be useful. It wasn't, apparently...
I run a VBA script that goes through multiple files (since Access cannot handle anything too large), appends them to a table, transforms it, and then spits out a summary.
The script loops through files and runs macros against each of them.
The quick answer is to pull the memory usage from the task manager and, if it exceeds 1 GB, stop the subroutine so no corrupt records get in.
How do we do this?
Insert this memory usage Function with the readfile function.
You will need to create an if statement in your code that says:
Dim memory As Long
memory = Memory_Usage
' tasklist reports K, so 1000000 ~ 1 GB
If memory > 1000000 Then
    Exit Sub
End If
=================================================
[path to file] = "C:\….\ShellOutputfile.txt"
Function Memory_Usage() as Long
Dim lines As Long
Dim linestring As String
Shell "tasklist /fi " & """IMAGENAME EQ MSACCESS.EXE""" & ">" & """[path to file]"""
'get_list_data
lines = CInt(get_listing_data("[path to file]", 1, 0))
linestring = get_listing_data("[path to file]", 2, 4)
linestring = Right(linestring, 11)
linestring = Replace(linestring, " K", "") ' K
linestring = Replace(linestring, " ", "")
lines = CLng(linestring)
Memory_Usage = lines
End Function
=============================
Public Function get_listing_data(PATH As String, Choice As Integer, typeofreading As Integer) As String
' parse in the variable, of which value you need.
Const ForReading = 1, ForWriting = 2, ForAppending = 8
Dim tmp_var_str As String
Dim fso, ts, fileObj, filename
Dim textline As String
Dim tmp_result As String
Dim TMP_PATH As String
Dim tmpchoice As Integer
Dim tor As Integer
Dim counter As Integer
' type of reading determines what loop is used
' type of reading = 0; to bypass; > 0, you are choosing a line to read.
counter = 0
TMP_PATH = PATH
tmp_var_str = var_str
tmp_result = ""
tor = typeofreading
' choice = 1 (count the lines)
' choice = 2 (read a specific line)
tmpchoice = Choice
' Create the file, and obtain a file object for the file.
If Right(PATH, 1) = "\" Then TMP_PATH = Left(PATH, Len(PATH) - 1)
filename = TMP_PATH '& "\Profit_Recognition.ini"
Set fso = CreateObject("Scripting.FileSystemObject")
Set fileObj = fso.GetFile(filename)
' Open a text stream for reading.
Set ts = fileObj.OpenAsTextStream(ForReading, TristateUseDefault)
Do While ts.AtEndOfStream <> True
If tmpchoice = 1 Then
counter = counter + 1
textline = ts.ReadLine
tmp_result = CStr(counter)
End If
If tmpchoice = 2 Then
counter = counter + 1
tmp_result = ts.ReadLine
If counter = tor Then
Exit Do
End If
End If
Loop
get_listing_data = tmp_result
End Function

VBScript to read specific line and extract characters and store it as a variable

I have a VB script that reads the 11th line from a text file. However from that line I need to extract characters 48 through 53 and save it as a variable. After this is accomplished I would like to use that variable and use it in a web url. Example below:
Contents of the szCPUSer.dat file look like this:
The script I have skips the first 10 lines and reads the 11th:
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile("szCPUSer.dat", ForReading)
For i = 1 to 10
objTextFile.ReadLine
Next
strLine = objTextFile.ReadLine
Wscript.Echo strLine
objTextFile.Close
I need the script to extract 03187 from the 11th line and then store it in a variable SerNum. After this, I would like to use the extracted number in a URL, for example:
http://seriallookup.com/serial=SerNum
The following works!
Const ForReading = 1
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objTextFile = objFSO.OpenTextFile("szCPUSer.dat", ForReading)
For i = 1 to 10
objTextFile.ReadLine
Next
strLine = objTextFile.ReadLine
Wscript.Echo strLine
objTextFile.Close
'Gets the last 6 chars of the line
SerNum = Right(strLine, 6)
'Keeps the first 5 of those 6 chars
SerNum = Left(SerNum, 5)
'Wscript.Echo SerNum
url = "http://seriallookup.com/serial=" & SerNum
Wscript.Echo url
Take a look at InStr function. It lets you search for a substring.
http://www.w3schools.com/vbscript/func_instr.asp
Then you can use the Right function to parse out the end bit of the line.
You can also look at the Split function so you can parse the lines into arrays and deal with it that way which would be best.
http://www.w3schools.com/vbscript/func_split.asp
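Since the question states fixed character positions (48 through 53), the built-in Mid function is another option. The sketch below is an illustration only: ExtractRange is a hypothetical helper, and the sample line is made up because the real contents of szCPUSer.dat are not shown.

```vbscript
Option Explicit

' Hypothetical helper: returns characters firstPos..lastPos (1-based, inclusive).
Function ExtractRange(text, firstPos, lastPos)
    ExtractRange = Mid(text, firstPos, lastPos - firstPos + 1)
End Function

Dim sampleLine, serNum, url
' Made-up stand-in for the 11th line: 47 filler characters, then six digits.
sampleLine = String(47, "-") & "003187" & " trailing text"
serNum = ExtractRange(sampleLine, 48, 53)
url = "http://seriallookup.com/serial=" & serNum
WScript.Echo url
```

Unlike the Right/Left combination, this does not depend on how long the line is, only on where the serial number starts.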

Creating Fixed Width files from strings

I have searched high and low on the internet and I can't find a straight answer to this !
I have a file that has approx 100,000 characters in one long line.
I need to read this file in and write it out again in its entirety, in lines 102 character long ending with VbCrLf. There are no delimiters.
I thought there were a number of ways to tackle issues like this in VB Script... but
apparently not !
Can anyone please provide me with a pointer ?
Here's something (off the top of my head - untested!) that should get you started.
Const ForReading = 1
Const ForWriting = 2
Dim sNewLine
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("OldFile.txt", ForReading) ' Your input file
Set tsOut = fso.OpenTextFile("NewFile.txt", ForWriting, True) ' New (output) file; True creates it if missing
While Not tsIn.AtEndOfStream ' While there is still text
sNewLine = tsIn.Read(102) ' Read 102 characters
tsOut.Write sNewLine & vbCrLf ' Write out to new file + CR/LF
Wend ' Loop to repeat
tsIn.Close
tsOut.Close
I won't cover the reading of files, since that is stuff you can find everywhere. And since it's been years since I've coded in VB or VBScript, I hope that .NET code will suffice.
Pseudocode: read the line from the file and put it in, for example, a string (performance issues, anyone?).
A simple algorithm would be the following; it might have performance issues (multithreading or parallel processing could be a solution):
Public Sub foo()
    Dim strLine As String = "foo²" ' placeholder for the single long line read from the file
    Dim strLines As New List(Of String)
    ' Walk the string in 102-character steps; the last chunk may be shorter
    For i As Integer = 0 To strLine.Length - 1 Step 102
        strLines.Add(strLine.Substring(i, Math.Min(102, strLine.Length - i)))
    Next
    'save it to file
End Sub