How to save a Unicode character to a text file - vba

This is in Word for Mac VBA. I want to save a Unicode character from a text box to a text file, for example the character "⅛".
I use this code:
Dim N as Long
N = FreeFile
Dim strText as String
strText = Textbox1.Text 'This is what is in the textbox "⅛"
Open <file path> For Output As N
Print #N, strText
Close N
It does not save the Unicode character. I understand I have to change the text encoding. How do I do that?
Likewise, how do I read a text file that is in a Unicode format?

I hope this works in VBA for Word on Mac as well, but on Windows I use the CreateTextFile method of the FileSystemObject (see the MSDN documentation). Its third argument lets me create a Unicode text file.
Set fsObject = CreateObject("Scripting.FileSystemObject")
Set xmlFile = fsObject.CreateTextFile("path/filename.txt", True, True) 'the second "true" forces a unicode file.
xmlFile.write "YourUnicodeTextHere"
xmlFile.close
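For the reading half of the question, here is a minimal sketch along the same lines. It assumes Windows and the same FileSystemObject: OpenTextFile takes a format argument, and -1 (TristateTrue) opens the file as Unicode. The function name ReadUnicodeFile is made up for illustration.

```vba
' Sketch only: reads back a file written with CreateTextFile(..., True, True).
' ReadUnicodeFile is a hypothetical helper name, not part of any library.
Function ReadUnicodeFile(ByVal filePath As String) As String
    Const ForReading As Long = 1
    Const TristateTrue As Long = -1   ' open the file as Unicode
    Dim fsObject As Object, textFile As Object
    Set fsObject = CreateObject("Scripting.FileSystemObject")
    Set textFile = fsObject.OpenTextFile(filePath, ForReading, False, TristateTrue)
    ' ReadAll raises an error on an empty file, so check first
    If Not textFile.AtEndOfStream Then ReadUnicodeFile = textFile.ReadAll
    textFile.Close
End Function
```

Usage would be something like strText = ReadUnicodeFile("path/filename.txt").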

VBA can't write text as UTF-8 this way. Use ADODB.Stream instead; despite the name, it handles plain text, not only databases.
'ensure reference is set to Microsoft ActiveX DataObjects library
'(the latest version of it) under "tools/references"
Sub AdoTest()
Dim adoStream As ADODB.Stream
Set adoStream = New ADODB.Stream
'Unicode coding
adoStream.Charset = "Unicode" 'or any string listed in registry HKEY_CLASSES_ROOT\MIME\Database\Charset
'open stream
adoStream.Open
'write a text
adoStream.WriteText "Text for testing: ěšč", StreamWriteEnum.stWriteLine
'save to file
adoStream.SaveToFile "D:\a\ado.txt"
adoStream.Close
End Sub
Reading is simpler, see my answer here:
Unicode and UTF-8 with VBA
Edited: I've inserted a complete example.
Edited 2: Added a reference to the list of charsets in the registry.
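For completeness, a minimal sketch of reading with the same object. It uses late binding, so no reference is needed, and the two constants are declared locally. The function name AdoReadText is hypothetical, and the charset is whatever the file was saved with.

```vba
' Sketch only: AdoReadText is a made-up helper name.
Function AdoReadText(ByVal filePath As String, ByVal charsetName As String) As String
    Const adTypeText As Long = 2    ' stream holds text, not raw bytes
    Const adReadAll As Long = -1    ' read to the end of the stream
    Dim adoStream As Object
    Set adoStream = CreateObject("ADODB.Stream")
    adoStream.Type = adTypeText
    adoStream.Charset = charsetName ' e.g. "Unicode" or "utf-8"
    adoStream.Open
    adoStream.LoadFromFile filePath
    AdoReadText = adoStream.ReadText(adReadAll)
    adoStream.Close
End Function
```

For the file written above, the call would be AdoReadText("D:\a\ado.txt", "Unicode").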

The question is about VBA on the Mac, and I'm afraid none of the answers above work on a Mac.
Unicode comes in many flavours. I'll address UTF-16 here; UTF-8 follows a different path, but it isn't difficult either. As far as I understand, your question is about a UTF-16 string.
The code below has no error handling; I'll let you take care of that.
Function writeUnicodeTextToFile(filePathName As String, myText As String)
Dim myFileNumber As Long, I As Long, byteArray() As Byte
myFileNumber = FreeFile()
Open filePathName For Binary As #myFileNumber
ReDim byteArray(1)
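' Create a BOM for your Unicode flavour
' (CHOOSE! one of the two, programmatically, or hard-code it)
' => Little Endian: matches the string bytes written below, which are
'    copied from memory unchanged (little-endian on Intel and ARM Macs)
byteArray(0) = 255: byteArray(1) = 254
' => Big Endian: only valid if you also swap each pair of bytes
'    in the text before writing it
'byteArray(0) = 254: byteArray(1) = 255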
' now write the two-byte BOM
Put myFileNumber, 1, byteArray
' redimension your byte array: LenB gives the text length in bytes
' (the assignment below resizes the array anyway, so this ReDim is
' optional, but it's more elegant)
I = LenB(myText) - 1
ReDim byteArray(I)
' populate the byte array...
byteArray = myText
' ... and write your text AFTER the BOM
Put myFileNumber, 3, byteArray
Close #myFileNumber
End Function
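For the second half of the question (reading the file back), a minimal sketch using the same native file I/O. It assumes the file is UTF-16 little endian, as written by the function above, and drops the two-byte BOM if one is present. The name readUnicodeTextFromFile is made up, and there is no error handling here either.

```vba
' Sketch only: assumes a UTF-16 LE file; the function name is hypothetical.
Function readUnicodeTextFromFile(filePathName As String) As String
    Dim myFileNumber As Long, fileSize As Long
    Dim byteArray() As Byte, rawText As String
    myFileNumber = FreeFile()
    Open filePathName For Binary Access Read As #myFileNumber
    fileSize = LOF(myFileNumber)
    If fileSize > 0 Then
        ReDim byteArray(0 To fileSize - 1)
        Get #myFileNumber, 1, byteArray
    End If
    Close #myFileNumber
    If fileSize = 0 Then Exit Function
    ' assigning a byte array to a string copies the bytes as they are
    rawText = byteArray
    If fileSize >= 2 Then
        ' skip the little-endian BOM (255, 254) if the file starts with one
        If byteArray(0) = 255 And byteArray(1) = 254 Then
            rawText = MidB$(rawText, 3)
        End If
    End If
    readUnicodeTextFromFile = rawText
End Function
```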

Here is a VBA routine that takes a string as input (your text) and fills an array of bytes with its UTF-8 encoding. You then write that array to disk in binary mode, starting after the first three bytes (the BOM).
You'll need these Public variables:
Public byteArray1() As Byte, regexUTF8 As String
Sub testing()
' creating the BOM
Dim bom(2) As Byte, someFile As Long
bom(0) = 239: bom(1) = 187: bom(2) = 191
' Writing something as utf-8
UTF16toUTF8 "L'élève de l'école"
someFile = FreeFile()
Open "MacDisk:test.txt" For Binary As #someFile
' first, the BOM
Put #someFile, 1, bom
' then the utf-8 text
Put #someFile, 4, byteArray1
Close #someFile
End Sub
Sub UTF16toUTF8(theString As String)
' by Yves Champollion
' Transforms a VB/VBA string (they're all 16-bit) into a byteArray1, utf-8 compliant
If isStringUTF8(theString) Then Exit Sub
Dim iLoop As Long, i As Long, k As Long
k = 0
ReDim byteArray1(Len(theString) * 4)
For iLoop = 1 To Len(theString)
i = AscW(Mid$(theString, iLoop, 1))
If i < 0 Then i = i + 65536
If i > -1 And i < 128 Then
byteArray1(k) = i
k = k + 1
ElseIf i >= 128 And i < 2048 Then
byteArray1(k) = (i \ 64) Or 192
byteArray1(k + 1) = (i And 63) Or 128
k = k + 2
ElseIf i >= 2048 And i < 65536 Then
byteArray1(k) = (i \ 4096) Or 224
byteArray1(k + 1) = ((i \ 64) And 63) Or 128
byteArray1(k + 2) = (i And 63) Or 128
k = k + 3
Else
byteArray1(k) = (i \ 262144) Or 240
byteArray1(k + 1) = (((i \ 4096) And 63)) Or 128
byteArray1(k + 2) = ((i \ 64) And 63) Or 128
byteArray1(k + 3) = (i And 63) Or 128
k = k + 4
End If
Next
ReDim Preserve byteArray1(k - 1)
End Sub
Function isStringUTF8(theString As String) As Boolean
' Long rather than Integer, so strings over 32,767 characters do not overflow the loop counter
Dim i As Long, j As Long, k As Long
' Prime the regex argument
If Len(regexUTF8) <> 66 Then
regexUTF8 = "*[" + Space$(62) + "]*"
For i = 192 To 253
Mid(regexUTF8, i - 189, 1) = Chr(i)
Next
End If
' First quick check: any escaping characters?
If Not theString Like regexUTF8 Then Exit Function
'longer check: are escaping characters followed by UTF-8 sequences?
For i = 1 To Len(theString) - 3
If Asc(Mid(theString, i, 1)) > 192 Then
k = Asc(Mid(theString, i, 1))
If k > 193 And k < 220 Then
If (Asc(Mid(theString, i + 1, 1)) And 128) Then
isStringUTF8 = True
Exit Function
End If
End If
If k > 223 Then
If (Asc(Mid(theString, i + 1, 1)) And 128) And (Asc(Mid(theString, i + 2, 1)) And 128) Then
isStringUTF8 = True
Exit Function
End If
End If
j = j + 1
If j > 100 Then Exit For
End If
Next
End Function
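Going the other way (UTF-8 bytes read from disk back into a VBA string) can be sketched as below. This is a rough illustration, not a validated decoder: the function name UTF8toUTF16 is made up, malformed input is handled only loosely, and it assumes ChrW accepts values up to 65535, which I have not checked on every Mac version of Office.

```vba
' Sketch only: decodes a non-empty array of UTF-8 bytes into a VBA string.
Function UTF8toUTF16(bytes() As Byte) As String
    Dim i As Long, last As Long, b As Long
    Dim cp As Long, extra As Long, j As Long
    i = LBound(bytes): last = UBound(bytes)
    ' skip the three-byte BOM (239, 187, 191) if present
    If last - i >= 2 Then
        If bytes(i) = 239 And bytes(i + 1) = 187 And bytes(i + 2) = 191 Then i = i + 3
    End If
    Do While i <= last
        b = bytes(i)
        If b < 128 Then
            cp = b: extra = 0            ' plain ASCII
        ElseIf b >= 192 And b < 224 Then
            cp = b And 31: extra = 1     ' two-byte sequence
        ElseIf b >= 224 And b < 240 Then
            cp = b And 15: extra = 2     ' three-byte sequence
        ElseIf b >= 240 And b < 248 Then
            cp = b And 7: extra = 3      ' four-byte sequence
        Else
            cp = 65533: extra = 0        ' stray byte becomes U+FFFD
        End If
        If i + extra > last Then Exit Do ' truncated sequence: stop here
        For j = 1 To extra
            cp = cp * 64 + (bytes(i + j) And 63)
        Next j
        i = i + extra + 1
        If cp < 65536 Then
            UTF8toUTF16 = UTF8toUTF16 & ChrW(cp)
        Else
            ' above the 16-bit range: emit a surrogate pair
            cp = cp - 65536
            UTF8toUTF16 = UTF8toUTF16 & ChrW(55296 + cp \ 1024) & ChrW(56320 + (cp And 1023))
        End If
    Loop
End Function
```

To use it, read the file into a byte array in binary mode and pass that array in. The string is built by plain concatenation, which is fine for short texts but slow for large files.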

Related

Double Escaping Escapes - Working with Bytes

I need to escape any ASCII escape bytes in a file. I've written this and thought it was working well, but I just noticed that there are a bunch of extra bytes at the end of the file now. There is probably a better way to do this, so here I am :) What's the best way to find bytes and add a byte next to each one?
Dim imageData() As Byte = File.ReadAllBytes(f_imagePath)
'Escape any ascii escapes
For i As Int32 = 0 To imageData.Length
If imageData(i) = &H1B Then
ReDim Preserve imageData(imageData.Length + 1)
'shift entire array
Dim arrCopy(imageData.Length + 1) As Byte
Array.Copy(imageData, 0, arrCopy, 0, i)
arrCopy(i) = &H1B
Array.Copy(imageData, i, arrCopy, i + 1, imageData.Length - i)
imageData = arrCopy
i = i + 1
End If
Next
Using a list...
Dim imageData() As Byte = File.ReadAllBytes(f_imagePath)
Dim newIMGData As New List(Of Byte)
'Escape any ascii escapes
For i As Int32 = 0 To imageData.Length - 1
If imageData(i) = &H1B Then
'not sure about this,
newIMGData.Add(imageData(i)) 'add the &H1B
newIMGData.Add(&H0) 'add the other character
Else
newIMGData.Add(imageData(i))
End If
Next
imageData = newIMGData.ToArray

How to generate md5-hashes for large files with VBA?

I have the following functions to generate md5-hashes for files. The functions work great for small files, but crash with Run-time error 7 (Out of memory) when I try to hash files over ~250 MB (I don't know the exact size at which it breaks, but files below 200 MB work fine).
I don't understand why it breaks at a certain size, so if anyone could shed some light on that I would appreciate it a lot.
Also, is there anything I can do to make the functions handle larger files? I intend to use the functions in a larger tool where I will need to generate hashes for files of unknown sizes. Most will be small enough for the current functions to work, but I will have to be able to handle large files as well.
I got my current functions from the most upvoted answer to this post: How to get the MD5 hex hash for a file using VBA?
Public Function FileToMD5Hex(ByVal strFileName As String) As String
Dim varEnc As Variant
Dim varBytes As Variant
Dim strOut As String
Dim intPos As Integer
Set varEnc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
'Convert the string to a byte array and hash it
varBytes = GetFileBytes(strFileName)
varBytes = varEnc.ComputeHash_2((varBytes))
'Convert the byte array to a hex string
For intPos = 1 To LenB(varBytes)
strOut = strOut & LCase(Right("0" & Hex(AscB(MidB(varBytes, intPos, 1))), 2))
Next
FileToMD5Hex = strOut
Set varEnc = Nothing
End Function
Private Function GetFileBytes(ByVal strPath As String) As Byte()
Dim lngFileNum As Long
Dim bytRtnVal() As Byte
lngFileNum = FreeFile
'If file exists, get number of bytes
If LenB(Dir(strPath)) Then
Open strPath For Binary Access Read As lngFileNum
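'Size the array to exactly LOF bytes; ReDim x(LOF) would add one extra zero byte to the hashed data
ReDim bytRtnVal(0 To LOF(lngFileNum) - 1) As Byte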
Get lngFileNum, , bytRtnVal
Close lngFileNum
Else
MsgBox "The file does not exist" & vbCrLf & "Aborting", vbCritical, "File not found"
Exit Function
End If
GetFileBytes = bytRtnVal
Erase bytRtnVal
End Function
Thank you
It looks like you reached the memory limit.
A better way would be to compute the MD5 of the file by block:
Public Function ComputeMD5(filepath As String) As String
Dim buffer() As Byte, svc As Object, hFile%, blockSize&, i&
blockSize = 2 ^ 16
' open the file '
If Len(Dir(filepath)) Then Else Err.Raise 5, , "file not found" & vbCr & filepath
hFile = FreeFile
Open filepath For Binary Access Read As hFile
' allocate buffer '
If LOF(hFile) < blockSize Then blockSize = ((LOF(hFile) + 1024) \ 1024) * 1024
ReDim buffer(0 To blockSize - 1)
' compute hash '
Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
For i = 1 To LOF(hFile) \ blockSize
Get hFile, , buffer
svc.TransformBlock buffer, 0, blockSize, buffer, 0
Next
Get hFile, , buffer
svc.TransformFinalBlock buffer, 0, LOF(hFile) Mod blockSize
buffer = svc.Hash
' cleanup '
svc.Clear
Close hFile
' convert to a hex string '
ComputeMD5 = String$(32, "0")
For i = 0 To 15
Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
Next
End Function
This is an extension to FlorentB's answer, which worked brilliantly for me until my files surpassed the 2 GB LOF() size limit.
I tried to adapt it to get the file length by other means, as follows:
Public Function ComputeMD5(filepath As String) As String
If Len(Dir(filepath)) Then Else Err.Raise 5, , "File not found." & vbCr & filepath
Dim blockSize As Long: blockSize = 2 ^ 20
Dim blockSize_f As Double
Dim buffer() As Byte
Dim fileLength As Variant
Dim hFile As Integer
Dim n_Reads As Long
Dim i As Long
Dim svc As Object: Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
fileLength = DecGetFileSize(filepath)
If fileLength < blockSize Then blockSize = ((fileLength + 1024) \ 1024) * 1024
ReDim buffer(0 To blockSize - 1)
n_Reads = fileLength / blockSize
blockSize_f = fileLength - (CDbl(blockSize) * n_Reads)
hFile = FreeFile
Open filepath For Binary Access Read As hFile
For i = 1 To n_Reads
Get hFile, i, buffer
svc.TransformBlock buffer, 0, blockSize, buffer, 0
Next i
Get hFile, i, buffer
svc.TransformFinalBlock buffer, 0, blockSize_f
buffer = svc.Hash
svc.Clear
Close hFile
ComputeMD5 = String$(32, "0")
For i = 0 To 15
Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
Next
End Function
Public Function DecGetFileSize(fname As String) As Variant
Dim fso As New FileSystemObject
Dim f: Set f = fso.GetFile(fname)
DecGetFileSize = CDec(f.Size)
Set f = Nothing
Set fso = Nothing
End Function
This all runs fine, returning a string, however that string does not equal the MD5 calculated using other tools on the same file.
I can't work out where the discrepancy is originating.
I've checked and double checked filelength, n_reads, blockSize and blockSize_f and I'm sure those values are all correct.
I had some trouble with the Get function, where if I didn't explicitly tell it the block number, it dies at block 2048.
Any ideas / pointers would be much appreciated.
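One thing that may be worth checking (a guess from reading the code, not something verified against your files): in Binary mode the second argument of Get is a 1-based byte position, not a block number, so Get hFile, i, buffer starts reading at byte i and successive reads overlap. Also, assigning fileLength / blockSize straight to a Long rounds to the nearest integer instead of truncating. The two helpers below sketch both calculations. Their names are made up, and because the positions are Longs this only reaches offsets below 2 GB.

```vba
' Sketch only: both helper names are hypothetical.

' Number of complete blocks in the file. Int() truncates, whereas
' assigning the quotient directly to a Long would round it.
Function FullBlockCount(ByVal fileLength As Variant, ByVal blockSize As Long) As Long
    FullBlockCount = CLng(Int(fileLength / blockSize))
End Function

' 1-based byte position at which block number blockIndex (1-based) starts.
' Overflows a Long once the position passes 2,147,483,647.
Function BlockStartPosition(ByVal blockIndex As Long, ByVal blockSize As Long) As Long
    BlockStartPosition = (blockIndex - 1) * blockSize + 1
End Function
```

With these, the read inside the loop would become Get hFile, BlockStartPosition(i, blockSize), buffer, and n_Reads would come from FullBlockCount(fileLength, blockSize).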

VBA to load very large file in one go (no buffering)

I am hitting an unexpected VB limit on the maximum string size, as explained in this post:
VBA unexpected reach of string size limit
While I expected to be able to load files up to 2 GB (2^31 characters) using Open path For Binary and the Get function, I get an "out of string space" error when I try to load a string larger than 255,918,061 characters.
I managed to work around this issue by buffering the input stream of Get. The problem is that I need to load the file as an array of strings by splitting the buffer on vbCrLf characters.
This then requires building the array line by line. Moreover, since I cannot be sure whether the buffer ends on a line break, I need additional operations. This solution is time- and memory-consuming: loading a 300 MB file with this code costs 900 MB (!) of memory in Excel. Is there a better solution?
Here is my code:
Function Load_File(path As String) As Variant
Dim MyData As String, FNum As Integer
Dim LenRemainingBytes As Long
Dim BufferSizeCurrent As Long
Dim FileByLines() As String
Dim CuttedLine As Boolean
Dim tmpSplit() As String
Dim FinalSplit() As String
Dim NbOfLines As Long
Dim LastLine As String
Dim count As Long, i As Long
Const BufferSizeMax As Long = 100000
FNum = FreeFile()
Open path For Binary As #FNum
LenRemainingBytes = LOF(FNum)
NbOfLines = FileNbOfLines(path)
ReDim FinalSplit(NbOfLines)
CuttedLine = False
Do While LenRemainingBytes > 0
MyData = ""
If LenRemainingBytes > BufferSizeMax Then
BufferSizeCurrent = BufferSizeMax
Else
BufferSizeCurrent = LenRemainingBytes
End If
MyData = Space$(BufferSizeCurrent)
Get #FNum, , MyData
tmpSplit = Split(MyData, vbCrLf)
If CuttedLine Then
count = count - 1
tmpSplit(0) = LastLine & tmpSplit(0)
For i = 0 To UBound(tmpSplit)
If count > NbOfLines Then Exit For
FinalSplit(count) = tmpSplit(i)
count = count + 1
Next i
Else
For i = 0 To UBound(tmpSplit)
If count > NbOfLines Then Exit For
FinalSplit(count) = tmpSplit(i)
count = count + 1
Next i
End If
Erase tmpSplit
LastLine = Right(MyData, Len(MyData) - InStrRev(MyData, vbCrLf) - 1)
CuttedLine = Len(LastLine) > 1
LenRemainingBytes = LenRemainingBytes - BufferSizeCurrent
Loop
Close FNum
Load_File = FinalSplit
Erase FinalSplit
End Function
Here the function FileNbOfLines efficiently returns the number of line-break characters.
Edit:
My Needs are:
To look for a specific string within the file
To get a specific number of lines coming after this string
Here you go, not pretty but should give you the general concept:
Sub GetLines()
Const fileName As String = "C:\Users\bloggsj\desktop\testfile.txt"
Const wordToFind As String = "FindMe"
Dim lineStart As String
Dim lineCount As String
Dim linesAfterWord As Long
With CreateObject("WScript.Shell")
lineCount = .Exec("CMD /C FIND /V /C """" """ & fileName & """").StdOut.ReadAll
lineStart = Split(.Exec("CMD /C FIND /N """ & wordToFind & """ """ & fileName & """").StdOut.ReadAll, vbCrLf)(2)
End With
linesAfterWord = CLng(Trim(Mid(lineCount, InStrRev(lineCount, ":") + 1))) - CLng(Trim(Mid(lineStart, 2, InStr(lineStart, "]") - 2)))
Debug.Print linesAfterWord
End Sub
This uses CMD to count the number of lines, then finds the line on which the word appears, then subtracts one from the other to give you the number of lines after the word.
Answer: Yes, using ReadAll from FSO should do the job.
Best answer: Just avoid it !
My needs were:
Identify a specific string within the file
Extract a certain number of lines after this string
As long as you know exactly how much data you want to extract, and assuming that amount is below the VBA string size limit (!), here is what does the job fastest.
Computation time is reduced further by using binary comparison of strings. My code is as follows:
Function GetFileLines(path As String, str As String, NbOfLines As Long) As String()
Const BUFSIZE As Long = 100000
Dim StringFound As Boolean
Dim lfAnsi As String
Dim strAnsi As String
Dim F As Integer
Dim BytesLeft As Long
Dim Buffer() As Byte
Dim strBuffer As String
Dim BufferOverlap As String
Dim PrevPos As Long
Dim NextPos As Long
Dim LineCount As Long
Dim data As String
F = FreeFile(0)
strAnsi = StrConv(str, vbFromUnicode) 'String to search for, as ANSI bytes
lfAnsi = StrConv(vbLf, vbFromUnicode) 'Line-break character, as an ANSI byte
Open path For Binary Access Read As #F
BytesLeft = LOF(F)
ReDim Buffer(BUFSIZE - 1)
'Overlapping buffer is 3/2 times the size of strBuffer
'(two bytes per character)
BufferOverlap = Space$(Int(3 * BUFSIZE / 4))
StringFound = False
Do Until BytesLeft = 0
If BytesLeft < BUFSIZE Then ReDim Buffer(BytesLeft - 1)
Get #F, , Buffer
strBuffer = Buffer 'Binary copy of bytes.
BytesLeft = BytesLeft - LenB(strBuffer)
Mid$(BufferOverlap, Int(BUFSIZE / 4) + 1) = strBuffer 'Overlapping Buffer
If Not StringFound Then 'Looking for the string
PrevPos = InStrB(BufferOverlap, strAnsi) 'Position of the searched string within the buffer
StringFound = PrevPos <> 0
If StringFound Then strBuffer = BufferOverlap
End If
If StringFound Then 'When string is found, loop until NbOfLines
Do Until LineCount = NbOfLines
NextPos = InStrB(PrevPos, strBuffer, lfAnsi)
If NextPos = 0 And LineCount < NbOfLines Then 'Buffer end reached, NbOfLines not reached
'Adding end of buffer to data
data = data & Mid$(StrConv(strBuffer, vbUnicode), PrevPos)
PrevPos = 1
Exit Do
Else
'Adding New Line to data
data = data & Mid$(StrConv(strBuffer, vbUnicode), PrevPos, NextPos - PrevPos + 1)
End If
PrevPos = NextPos + 1
LineCount = LineCount + 1
If LineCount = NbOfLines Then Exit Do
Loop
End If
If LineCount = NbOfLines Then Exit Do
Mid$(BufferOverlap, 1, Int(BUFSIZE / 4)) = Mid$(strBuffer, Int(BUFSIZE / 4))
Loop
Close F
GetFileLines = Split(data, vbCrLf)
End Function
To cut computation time further, fast string concatenation is highly advisable, as explained here.
For instance the following function can be used:
Sub FastConcat(ByRef Dest As String, ByVal Source As String, ByRef ccOffset)
Dim L As Long, Buffer As Long
Buffer = 50000
L = Len(Source)
If (ccOffset + L) >= Len(Dest) Then
If L > Buffer Then
Dest = Dest & Space$(L)
Else
Dest = Dest & Space$(Buffer)
End If
End If
Mid$(Dest, ccOffset + 1, L) = Source
ccOffset = ccOffset + L
End Sub
And then use the function as follows:
NbOfChars = 0
Do until...
FastConcat MyString, AddedString, NbOfChars
Loop
MyString = Left$(MyString,NbOfChars)
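A concrete version of that pattern, as a small self-contained sketch. It relies on the FastConcat routine above being in the same project; the Sub name and the sample text are made up.

```vba
' Sketch only: builds a 1,000-line string with FastConcat, then trims it.
Sub FastConcatDemo()
    Dim result As String, nbOfChars As Long, i As Long
    nbOfChars = 0
    For i = 1 To 1000
        ' FastConcat grows result in large steps and advances nbOfChars
        FastConcat result, "line " & i & vbCrLf, nbOfChars
    Next i
    ' drop the unused padding left at the end of the buffer
    result = Left$(result, nbOfChars)
    Debug.Print Len(result)
End Sub
```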

Open/Read a binary file - access rights

I am trying to convert VB5 to .NET and cannot get a binary read to work. My VB.NET decode only reads the first two characters correctly.
The (VB5->VB.NET) encode is
' Open file
x = Rnd(-mKeyValue)
filenum = FreeFile()
Try
FileOpen(filenum, Filename, OpenMode.Binary)
Catch ex As IO.IOException
MsgBox(ex.ToString, MsgBoxStyle.Critical, "File opening error")
Exit Sub
End Try
' write data
filecontents = ""
For i = 1 To Len(stringdate)
charnum = Asc(Mid(stringdate, i, 1))
randomint = Int(256 * Rnd())
charnum = charnum Xor randomint
singlechar = Chr(charnum)
FilePut(filenum, singlechar, i)
filecontents = filecontents & singlechar
Next i
And the (VB5->VB.NET) decode is
x = Rnd(-mKeyValue)
filenum = FreeFile()
FileOpen(filenum, Filename, OpenMode.Binary)
For i = 1 To LOF(filenum)
'To VB.NET
FileGet(filenum, singlechar, i)
charnum = Asc(singlechar)
Debug.Print("VB5 singlechar = " & singlechar)
randomint = Int(256 * Rnd())
charnum = charnum Xor randomint
singlechar = Chr(charnum)
Next i
My VB.NET code which fails (cannot read the file correctly) is;
Using reader As New BinaryReader(File.Open(Filename, FileMode.Open))
' Loop through length of file.
Dim pos As Integer = 0
Dim length As Integer = reader.BaseStream.Length
While pos < length
' Read the integer.
singlechar = reader.ReadChar()
charnum = Asc(singlechar) 'singlechar is type Char
randomint = Int(256 * Rnd())
charnum = charnum Xor randomint
singlechar = Chr(charnum)
i += 1
End While
End Using
Can anyone help me with translation from VB5 to .NET?
In VB.Net everything is a bit shorter ;)
' get a string from an encrypted file:
' (call Rnd(-mKeyValue) first to seed the sequence, as the VB5 code does)
Dim b() As Byte = IO.File.ReadAllBytes("path")
For i = 0 To b.Length - 1
b(i) = b(i) Xor CByte(Int(256 * Rnd())) ' Int() truncates, matching the VB5 code
Next
Dim s As String = System.Text.Encoding.ASCII.GetString(b)
Why read byte by byte (there is no sense in reading a Char anyway, since you only want the 8-bit ASCII code) when .NET can read it all at once? Your file is not larger than 100 MB, I assume? After getting the array, you simply XOR each element with your "random" value. If you don't need to be compatible with old versions, you would do better to use Random. Or maybe even better: use real encryption (in .NET it's built in!)
' put a string into a file
Dim c() As Byte = System.Text.Encoding.ASCII.GetBytes("The String you want to store encrypted")
For i = 0 To c.Length - 1
c(i) = c(i) Xor CByte(Int(256 * Rnd())) ' Int() truncates, matching the VB5 code
Next
IO.File.WriteAllBytes("another path", c)
The same goes for "encrypting". Convert the string to an array of bytes (= ASCII values), XOR it and then write it out in ONE operation.
See the dangers of Unicode:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
' Beware of UNICODE ... !!!
Using sw As New FileStream("foo.foo", FileMode.OpenOrCreate, FileAccess.Write)
' with old VB you effectively wrote BYTE data
sw.Write({65, 192}, 0, 2)
End Using
Using br As New BinaryReader(File.Open("foo.foo", FileMode.Open, FileAccess.Read))
' You are telling .NET that you expect a Char, which is not ASCII but Unicode
Dim c As Char = br.ReadChar
Dim d As Char = br.ReadChar
Dim cc = Asc(c)
Dim dd = Asc(d)
Debug.Print("65 -> {0}, 192 -> {1}", cc, dd)
End Using
End Sub
The output is 65 -> 65, 192 -> 63

Read Number of lines in Large Text File VB6

I have a text file of size 230 MB. I want to count the number of lines in that file.
I tried "Scripting.FileSystemObject" but it runs out of memory.
Please help.
Thanks.
Normal Windows line breaks are CRLF, so you can count the LFs and add 1 to the count in cases where the last line of your files doesn't have one after it.
In true VB (i.e. VB5, VB6, etc.) you can make use of the byte-oriented String operations to speed many tasks. If we can assume the text files contain ANSI then this is pretty fast:
Option Explicit
Private Sub Main()
Const BUFSIZE As Long = 100000
Dim T0 As Single
Dim LfAnsi As String
Dim F As Integer
Dim FileBytes As Long
Dim BytesLeft As Long
Dim Buffer() As Byte
Dim strBuffer As String
Dim BufPos As Long
Dim LineCount As Long
T0 = Timer()
LfAnsi = StrConv(vbLf, vbFromUnicode)
F = FreeFile(0)
Open "big.txt" For Binary Access Read As #F
FileBytes = LOF(F)
ReDim Buffer(BUFSIZE - 1)
BytesLeft = FileBytes
Do Until BytesLeft = 0
If BufPos = 0 Then
If BytesLeft < BUFSIZE Then ReDim Buffer(BytesLeft - 1)
Get #F, , Buffer
strBuffer = Buffer 'Binary copy of bytes.
BytesLeft = BytesLeft - LenB(strBuffer)
BufPos = 1
End If
Do Until BufPos = 0
BufPos = InStrB(BufPos, strBuffer, LfAnsi)
If BufPos > 0 Then
LineCount = LineCount + 1
BufPos = BufPos + 1
End If
Loop
Loop
Close #F
'Add 1 to LineCount if last line of your files do not
'have a trailing CrLf.
MsgBox "Counted " & Format$(LineCount, "#,##0") & " lines in" & vbNewLine _
& Format$(FileBytes, "#,##0") & " bytes of text." & vbNewLine _
& Format$(Timer() - T0, "0.0#") & " seconds."
End Sub
Given a 7,000,000 line file of 293MB it only takes 0.7 seconds here. But note that I had not rebooted to ensure that the file wasn't cached when I ran that test. Without caching (i.e. after a reboot) I'd expect it to take as long as 5 times that.
Converting to handle Unicode text files is fairly simple. Just replace the B-functions by the non-B equivalents, make sure you set BUFSIZE to a multiple of 2, and search for vbLf instead of an ANSI LF byte.
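The conversion described above can be sketched as follows. It assumes the file is UTF-16 little endian; the function name CountLinesUtf16 is made up, and as before the count is the number of LF characters.

```vba
' Sketch only: counts vbLf characters in a UTF-16 LE text file.
Function CountLinesUtf16(ByVal path As String) As Long
    Const BUFSIZE As Long = 100000   ' bytes; must be a multiple of 2
    Dim F As Integer, BytesLeft As Long
    Dim Buffer() As Byte, strBuffer As String, BufPos As Long
    F = FreeFile(0)
    Open path For Binary Access Read As #F
    BytesLeft = LOF(F)
    If BytesLeft > 0 Then ReDim Buffer(BUFSIZE - 1)
    Do Until BytesLeft = 0
        If BytesLeft < BUFSIZE Then ReDim Buffer(BytesLeft - 1)
        Get #F, , Buffer
        strBuffer = Buffer           ' binary copy: two bytes per character
        BytesLeft = BytesLeft - LenB(strBuffer)
        BufPos = 1
        Do
            ' character-based search instead of the byte-based InStrB
            BufPos = InStr(BufPos, strBuffer, vbLf, vbBinaryCompare)
            If BufPos = 0 Then Exit Do
            CountLinesUtf16 = CountLinesUtf16 + 1
            BufPos = BufPos + 1
        Loop
    Loop
    Close #F
End Function
```

Because BUFSIZE is even, no two-byte character is split across two reads.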
You can do it by reading each line into the same variable. There's no need to save all the lines:
Dim s As String
Dim n As Long ' Long: an Integer would overflow past 32,767 lines
Open "filename.txt" For Input As #1
n = 0
Do While Not EOF(1)
Line Input #1, s
n = n + 1
Loop
Close #1
This has not been tested, and it's been a while since I've done any VB6, but it should be close.
This takes about 6 seconds for me on a 480 MB binary file with over a million 0x0D (vbCr) bytes:
Dim buff() As Byte
Dim hF As Integer
Dim i As Long, n As Long
hF = FreeFile(0)
Open "c:\windows\Installer\2f91fd.msp" For Binary Access Read As #hF
ReDim buff(LOF(hF) - 1)
Get #hF, , buff()
Close #hF
For i = 0 To UBound(buff)
If buff(i) = 13 Then n = n + 1
Next
MsgBox n