How to get the MD5 hex hash for a file using VBA? - vba

How can I get the MD5 hex hash for a file using VBA?
I need a version that works for a file.
Something as simple as this Python code:
import hashlib
def md5_for_file(fileLocation, block_size=2**20):
    f = open(fileLocation, 'rb')  # binary mode, so the raw bytes are hashed
    md5 = hashlib.md5()
    while True:
        data = f.read(block_size)
        if not data:
            break
        md5.update(data)
    f.close()
    return md5.hexdigest()
But in VBA.

An older question that could use a better answer. These functions are specifically for hashing files, not for hashing passwords. As a bonus, I'm including a function for SHA1. If you remove the type declarations, these functions also work in VBScript, except that GetFileBytes needs to be changed to use FileSystemObject (or possibly an ADO Stream), because FreeFile doesn't exist in VBScript.
Private Sub TestMD5()
Debug.Print FileToMD5Hex("C:\test.txt")
Debug.Print FileToSHA1Hex("C:\test.txt")
End Sub
Public Function FileToMD5Hex(sFileName As String) As String
Dim enc
Dim bytes
Dim outstr As String
Dim pos As Integer
Set enc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
'Read the file into a byte array and hash it
bytes = GetFileBytes(sFileName)
bytes = enc.ComputeHash_2((bytes))
'Convert the byte array to a hex string
For pos = 1 To LenB(bytes)
outstr = outstr & LCase(Right("0" & Hex(AscB(MidB(bytes, pos, 1))), 2))
Next
FileToMD5Hex = outstr
Set enc = Nothing
End Function
Public Function FileToSHA1Hex(sFileName As String) As String
Dim enc
Dim bytes
Dim outstr As String
Dim pos As Integer
Set enc = CreateObject("System.Security.Cryptography.SHA1CryptoServiceProvider")
'Read the file into a byte array and hash it
bytes = GetFileBytes(sFileName)
bytes = enc.ComputeHash_2((bytes))
'Convert the byte array to a hex string
For pos = 1 To LenB(bytes)
outstr = outstr & LCase(Right("0" & Hex(AscB(MidB(bytes, pos, 1))), 2))
Next
FileToSHA1Hex = outstr 'Returns a 40 byte/character hex string
Set enc = Nothing
End Function
Private Function GetFileBytes(ByVal path As String) As Byte()
Dim lngFileNum As Long
Dim bytRtnVal() As Byte
lngFileNum = FreeFile
If LenB(Dir(path)) Then ''// Does file exist?
Open path For Binary Access Read As lngFileNum
ReDim bytRtnVal(LOF(lngFileNum) - 1&) As Byte
Get lngFileNum, , bytRtnVal
Close lngFileNum
Else
Err.Raise 53
End If
GetFileBytes = bytRtnVal
Erase bytRtnVal
End Function
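The hex-conversion loop in the functions above pads every byte to two digits (Right("0" & Hex(...), 2)), because Hex drops leading zeros. As a rough illustration of the same idea in Python, the language the question uses for comparison, here is a minimal sketch; the helper name to_hex is made up for this example.

```python
import hashlib

def to_hex(digest: bytes) -> str:
    """Lower-case hex string with every byte padded to two digits."""
    return "".join(format(b, "02x") for b in digest)

digest = hashlib.md5(b"hello").digest()   # 16 raw bytes
print(to_hex(digest))
# The padded result matches what hashlib produces directly.
print(to_hex(digest) == hashlib.md5(b"hello").hexdigest())
```

Without the padding, a byte such as 0x0F would contribute a single character and the result would be shorter than 32 characters.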

This should do it (note that this is VB.NET rather than VBA, and that it returns the hash as Base64 rather than hex):
Dim fileBytes() As Byte = File.ReadAllBytes(path:=fullPath)
Dim Md5 As New MD5CryptoServiceProvider()
Dim byteHash() As Byte = Md5.ComputeHash(fileBytes)
Return Convert.ToBase64String(byteHash)
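The snippet above returns the digest as Base64, while the question asks for hex. Both are just text encodings of the same 16 raw bytes, so one can be converted into the other. A small Python sketch of that relationship, using the digest of empty input as sample data:

```python
import base64
import hashlib

digest = hashlib.md5(b"").digest()                  # 16 raw bytes
as_hex = digest.hex()                               # 32 hex characters
as_b64 = base64.b64encode(digest).decode("ascii")   # 24 Base64 characters

print(as_hex)
print(as_b64)
# Converting a Base64 digest back to hex:
print(base64.b64decode(as_b64).hex() == as_hex)
```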

Related

Convert base64 to image

I have the following 2 functions which give me the base64 of an image
Option Compare Database
Option Explicit
Function readBytes(strFile As String) As Variant
Const TypeBinary = 1
Dim inStream As Object
' ADODB stream object used
Set inStream = CreateObject("ADODB.Stream")
' open with no arguments makes the stream an empty container
inStream.Open
inStream.Type = TypeBinary
inStream.LoadFromFile strFile
readBytes = inStream.Read()
End Function
Function encodeBase64(arrBytes As Variant) As String
Dim DM As Object, EL As Object
Set DM = CreateObject("Microsoft.XMLDOM")
' Create temporary node with Base64 data type
Set EL = DM.createElement("tmp")
EL.dataType = "bin.base64"
' Set bytes, get encoded String
EL.nodeTypedValue = arrBytes
encodeBase64 = EL.Text
End Function
Sub test()
TestIt CurrentProject.Path & "\pic.jpg"
End Sub
Function TestIt(strFile As String) As String
Dim arrBytes As Variant, strRet As String
arrBytes = readBytes(strFile)
strRet = encodeBase64(arrBytes)
Dim s As String
Open CurrentProject.Path & "\pic_base64.txt" For Binary As #1
Put #1, 1, strRet
Close #1
End Function
I need to convert the following two functions to the same VBA format as the functions above, and add a test function that takes the Base64 string produced above and writes it back to the image pic.jpg.
private function decodeBase64(base64)
dim DM, EL
Set DM = CreateObject("Microsoft.XMLDOM")
' Create temporary node with Base64 data type
Set EL = DM.createElement("tmp")
EL.DataType = "bin.base64"
' Set encoded String, get bytes
EL.Text = base64
decodeBase64 = EL.NodeTypedValue
end function
private Sub writeBytes(file, bytes)
Dim binaryStream
Set binaryStream = CreateObject("ADODB.Stream")
binaryStream.Type = TypeBinary
'Open the stream and write binary data
binaryStream.Open
binaryStream.Write bytes
'Save binary data to disk
binaryStream.SaveToFile file, ForWriting
End Sub
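The round trip being asked for (bytes to Base64 text, then back to the identical bytes) can be sketched in Python as follows. This is only an illustration of the expected behaviour, not a VBA conversion; the function names mirror the ones in the question, and the sample bytes are made up rather than taken from a real image.

```python
import base64

def encode_base64(data: bytes) -> str:
    """Bytes to Base64 text."""
    return base64.b64encode(data).decode("ascii")

def decode_base64(text: str) -> bytes:
    """Base64 text back to the original bytes."""
    return base64.b64decode(text)

original = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])  # made-up sample bytes
text = encode_base64(original)
restored = decode_base64(text)
print(text)
print(restored == original)   # the decoded bytes should match exactly
```

A useful check for any VBA version is the same one: the file written back should be byte-for-byte identical to the original pic.jpg.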

How to generate md5-hashes for large files with VBA?

I have the following functions to generate md5-hashes for files. The functions work great for small files, but crash with Run-time error 7 - Out of memory when I try to hash files over ~250 MB (I don't know the exact size at which it breaks, but files below 200 MB work fine).
I don't understand why it breaks at a certain size, so if anyone could shed some light on that I would appreciate it a lot.
Also, is there anything I can do to make the functions handle larger files? I intend to use the functions in a larger tool where I will need to generate hashes for files of unknown sizes. Most will be small enough for the current functions to work, but I will have to be able to handle large files as well.
I got my current functions from the most upvoted answer to this post: How to get the MD5 hex hash for a file using VBA?
Public Function FileToMD5Hex(ByVal strFileName As String) As String
Dim varEnc As Variant
Dim varBytes As Variant
Dim strOut As String
Dim intPos As Integer
Set varEnc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
'Read the file into a byte array and hash it
varBytes = GetFileBytes(strFileName)
varBytes = varEnc.ComputeHash_2((varBytes))
'Convert the byte array to a hex string
For intPos = 1 To LenB(varBytes)
strOut = strOut & LCase(Right("0" & Hex(AscB(MidB(varBytes, intPos, 1))), 2))
Next
FileToMD5Hex = strOut
Set varEnc = Nothing
End Function
Private Function GetFileBytes(ByVal strPath As String) As Byte()
Dim lngFileNum As Long
Dim bytRtnVal() As Byte
lngFileNum = FreeFile
'If file exists, get number of bytes
If LenB(Dir(strPath)) Then
Open strPath For Binary Access Read As lngFileNum
ReDim bytRtnVal(LOF(lngFileNum) - 1&) As Byte 'Zero-based array, so the upper bound is length - 1
Get lngFileNum, , bytRtnVal
Close lngFileNum
Else
MsgBox "The file does not exist" & vbCrLf & "Aborting", vbCritical, "File not found"
Exit Function
End If
GetFileBytes = bytRtnVal
Erase bytRtnVal
End Function
Thank you
It looks like you reached the memory limit.
A better way would be to compute the MD5 of the file block by block:
Public Function ComputeMD5(filepath As String) As String
Dim buffer() As Byte, svc As Object, hFile%, blockSize&, i&
blockSize = 2 ^ 16
' open the file '
If Len(Dir(filepath)) Then Else Err.Raise 5, , "file not found" & vbCr & filepath
hFile = FreeFile
Open filepath For Binary Access Read As hFile
' allocate buffer '
If LOF(hFile) < blockSize Then blockSize = ((LOF(hFile) + 1024) \ 1024) * 1024
ReDim buffer(0 To blockSize - 1)
' compute hash '
Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
For i = 1 To LOF(hFile) \ blockSize
Get hFile, , buffer
svc.TransformBlock buffer, 0, blockSize, buffer, 0
Next
Get hFile, , buffer
svc.TransformFinalBlock buffer, 0, LOF(hFile) Mod blockSize
buffer = svc.Hash
' cleanup '
svc.Clear
Close hFile
' convert to a hex string '
ComputeMD5 = String$(32, "0")
For i = 0 To 15
Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
Next
End Function
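The reason block-by-block hashing helps is that only one block is held in memory at a time, while the resulting digest is the same as hashing the whole content at once. A minimal Python sketch of that equivalence, using in-memory sample data and a hypothetical helper name md5_in_blocks:

```python
import hashlib
import io

def md5_in_blocks(stream, block_size=2**16):
    """Feed the stream to MD5 one block at a time."""
    md5 = hashlib.md5()
    while True:
        block = stream.read(block_size)
        if not block:          # empty read means end of stream
            break
        md5.update(block)
    return md5.hexdigest()

data = bytes(range(256)) * 1000   # 256,000 bytes of sample data
# Incremental hashing gives the same digest as hashing everything in one call.
print(md5_in_blocks(io.BytesIO(data)) == hashlib.md5(data).hexdigest())
```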
This is an extension to FlorentB's answer, which worked brilliantly for me until my files surpassed the 2GB LOF() size limit.
I tried to adapt for getting file length by alternate means as follows:
Public Function ComputeMD5(filepath As String) As String
If Len(Dir(filepath)) Then Else Err.Raise 5, , "File not found." & vbCr & filepath
Dim blockSize As Long: blockSize = 2 ^ 20
Dim blockSize_f As Double
Dim buffer() As Byte
Dim fileLength As Variant
Dim hFile As Integer
Dim n_Reads As Long
Dim i As Long
Dim svc As Object: Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
fileLength = DecGetFileSize(filepath)
If fileLength < blockSize Then blockSize = ((fileLength + 1024) \ 1024) * 1024
ReDim buffer(0 To blockSize - 1)
n_Reads = fileLength / blockSize
blockSize_f = fileLength - (CDbl(blockSize) * n_Reads)
hFile = FreeFile
Open filepath For Binary Access Read As hFile
For i = 1 To n_Reads
Get hFile, i, buffer
svc.TransformBlock buffer, 0, blockSize, buffer, 0
Next i
Get hFile, i, buffer
svc.TransformFinalBlock buffer, 0, blockSize_f
buffer = svc.Hash
svc.Clear
Close hFile
ComputeMD5 = String$(32, "0")
For i = 0 To 15
Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
Next
End Function
Public Function DecGetFileSize(fname As String) As Variant
Dim fso As New FileSystemObject
Dim f: Set f = fso.GetFile(fname)
DecGetFileSize = CDec(f.Size)
Set f = Nothing
Set fso = Nothing
End Function
This all runs fine, returning a string, however that string does not equal the MD5 calculated using other tools on the same file.
I can't work out where the discrepancy is originating.
I've checked and double checked filelength, n_reads, blockSize and blockSize_f and I'm sure those values are all correct.
I had some trouble with the Get function, where if I didn't explicitly tell it the block number, it dies at block 2048.
Any ideas / pointers would be much appreciated.
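One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: for a file opened For Binary, the second argument of Get is a 1-based byte position, not a block number. If that applies here, Get hFile, i, buffer would read overlapping blocks starting at bytes 1, 2, 3 and so on. Separately, assigning fileLength / blockSize to a Long rounds to the nearest integer instead of truncating, so Int(fileLength / blockSize) may be what is intended. The Python sketch below compares the two interpretations of the position argument on in-memory data; both helper names are made up for the illustration.

```python
import hashlib

def md5_by_block_offsets(data: bytes, block_size: int) -> str:
    """Block i starts at byte offset i * block_size (the intended behaviour)."""
    md5 = hashlib.md5()
    n_full, remainder = divmod(len(data), block_size)   # truncating division
    for i in range(n_full):
        start = i * block_size
        md5.update(data[start:start + block_size])
    if remainder:
        md5.update(data[n_full * block_size:])
    return md5.hexdigest()

def md5_by_index_as_position(data: bytes, block_size: int) -> str:
    """Block i starts at byte i (block index mistaken for a byte position)."""
    md5 = hashlib.md5()
    n_full, remainder = divmod(len(data), block_size)
    for i in range(n_full):
        md5.update(data[i:i + block_size])               # blocks overlap
    md5.update(data[n_full:n_full + remainder])
    return md5.hexdigest()

data = bytes(range(251)) * 41                            # 10,291 sample bytes
print(md5_by_block_offsets(data, 1024) == hashlib.md5(data).hexdigest())
print(md5_by_index_as_position(data, 1024) == hashlib.md5(data).hexdigest())
```

In VBA terms, the byte position of block i (counting from 1) would be (i - 1) * blockSize + 1, under the assumption described above.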

VBA to load very large file in one go (no buffering)

I am running into an unexpected VBA limit on the maximum string size, as explained in this post:
VBA unexpected reach of string size limit
While I expected to be able to load files of up to 2 GB (2^31 characters) using Open path For Binary and the Get statement, I get an out of string space error when I try to load a string larger than 255,918,061 characters.
I managed to work around this issue by buffering the input read by Get. The problem is that I need to load the file as an array of strings, by splitting the buffer on vbCrLf.
This requires building the array line by line. Moreover, since I cannot be sure whether a buffer ends on a line break, I need additional operations. This solution is time- and memory-consuming: loading a 300 MB file with this code costs 900 MB (!) of memory in Excel. Is there a better solution?
Here is my code:
Function Load_File(path As String) As Variant
Dim MyData As String, FNum As Integer
Dim LenRemainingBytes As Long
Dim BufferSizeCurrent As Long
Dim FileByLines() As String
Dim CuttedLine As Boolean
Dim tmpSplit() As String
Dim FinalSplit() As String
Dim NbOfLines As Long
Dim LastLine As String
Dim count As Long, i As Long
Const BufferSizeMax As Long = 100000
FNum = FreeFile()
Open path For Binary As #FNum
LenRemainingBytes = LOF(FNum)
NbOfLines = FileNbOfLines(path)
ReDim FinalSplit(NbOfLines)
CuttedLine = False
Do While LenRemainingBytes > 0
MyData = ""
If LenRemainingBytes > BufferSizeMax Then
BufferSizeCurrent = BufferSizeMax
Else
BufferSizeCurrent = LenRemainingBytes
End If
MyData = Space$(BufferSizeCurrent)
Get #FNum, , MyData
tmpSplit = Split(MyData, vbCrLf)
If CuttedLine Then
count = count - 1
tmpSplit(0) = LastLine & tmpSplit(0)
For i = 0 To UBound(tmpSplit)
If count > NbOfLines Then Exit For
FinalSplit(count) = tmpSplit(i)
count = count + 1
Next i
Else
For i = 0 To UBound(tmpSplit)
If count > NbOfLines Then Exit For
FinalSplit(count) = tmpSplit(i)
count = count + 1
Next i
End If
Erase tmpSplit
LastLine = Right(MyData, Len(MyData) - InStrRev(MyData, vbCrLf) - 1)
CuttedLine = Len(LastLine) > 1
LenRemainingBytes = LenRemainingBytes - BufferSizeCurrent
Loop
Close FNum
Load_File = FinalSplit
Erase FinalSplit
End Function
Where the function FileNbOfLines efficiently returns the number of line-break characters.
Edit:
My Needs are:
To look for a specific string within the file
To get a specific number of lines coming after this string
Here you go, not pretty but should give you the general concept:
Sub GetLines()
Const fileName As String = "C:\Users\bloggsj\desktop\testfile.txt"
Const wordToFind As String = "FindMe"
Dim lineStart As String
Dim lineCount As String
Dim linesAfterWord As Long
With CreateObject("WScript.Shell")
lineCount = .Exec("CMD /C FIND /V /C """" """ & fileName & """").StdOut.ReadAll
lineStart = Split(.Exec("CMD /C FIND /N """ & wordToFind & """ """ & fileName & """").StdOut.ReadAll, vbCrLf)(2)
End With
linesAfterWord = CLng(Trim(Mid(lineCount, InStrRev(lineCount, ":") + 1))) - CLng(Trim(Mid(lineStart, 2, InStr(lineStart, "]") - 2)))
Debug.Print linesAfterWord
End Sub
Uses CMD to count the number of lines, then finds the line on which the word appears, then subtracts one from the other to give the number of lines after the word was found.
Answer: Yes, using ReadAll from FSO should do the job.
Best answer: Just avoid it!
My needs were:
Identify a specific string within the file
Extract a certain number of lines after this string
As long as you know exactly how much data you want to extract, and assuming that amount is below the VBA string size limit (!), here is what does the job fastest.
Computation time is reduced further by using binary comparison of strings. My code is as follows:
Function GetFileLines(path As String, str As String, NbOfLines As Long) As String()
Const BUFSIZE As Long = 100000
Dim StringFound As Boolean
Dim lfAnsi As String
Dim strAnsi As String
Dim F As Integer
Dim BytesLeft As Long
Dim Buffer() As Byte
Dim strBuffer As String
Dim BufferOverlap As String
Dim PrevPos As Long
Dim NextPos As Long
Dim LineCount As Long
Dim data As String
F = FreeFile(0)
strAnsi = StrConv(str, vbFromUnicode) 'Looked String
lfAnsi = StrConv(vbLf, vbFromUnicode) 'LineBreak character
Open path For Binary Access Read As #F
BytesLeft = LOF(F)
ReDim Buffer(BUFSIZE - 1)
'Overlapping buffer is 3/2 times the size of strBuffer
'(two bytes per character)
BufferOverlap = Space$(Int(3 * BUFSIZE / 4))
StringFound = False
Do Until BytesLeft = 0
If BytesLeft < BUFSIZE Then ReDim Buffer(BytesLeft - 1)
Get #F, , Buffer
strBuffer = Buffer 'Binary copy of bytes.
BytesLeft = BytesLeft - LenB(strBuffer)
Mid$(BufferOverlap, Int(BUFSIZE / 4) + 1) = strBuffer 'Overlapping Buffer
If Not StringFound Then 'Looking for the string
PrevPos = InStrB(BufferOverlap, strAnsi) 'Position of the looked string within the buffer
StringFound = PrevPos <> 0
If StringFound Then strBuffer = BufferOverlap
End If
If StringFound Then 'When string is found, loop until NbOfLines
Do Until LineCount = NbOfLines
NextPos = InStrB(PrevPos, strBuffer, lfAnsi)
If NextPos = 0 And LineCount < NbOfLines Then 'Buffer end reached, NbOfLines not reached
'Adding end of buffer to data
data = data & Mid$(StrConv(strBuffer, vbUnicode), PrevPos)
PrevPos = 1
Exit Do
Else
'Adding New Line to data
data = data & Mid$(StrConv(strBuffer, vbUnicode), PrevPos, NextPos - PrevPos + 1)
End If
PrevPos = NextPos + 1
LineCount = LineCount + 1
If LineCount = NbOfLines Then Exit Do
Loop
End If
If LineCount = NbOfLines Then Exit Do
Mid$(BufferOverlap, 1, Int(BUFSIZE / 4)) = Mid$(strBuffer, Int(BUFSIZE / 4))
Loop
Close F
GetFileLines = Split(data, vbCrLf)
End Function
To cut computation time even further, fast string concatenation is highly advisable, as explained here.
For instance, the following procedure can be used:
Sub FastConcat(ByRef Dest As String, ByVal Source As String, ByRef ccOffset)
Dim L As Long, Buffer As Long
Buffer = 50000
L = Len(Source)
If (ccOffset + L) >= Len(Dest) Then
If L > Buffer Then
Dest = Dest & Space$(L)
Else
Dest = Dest & Space$(Buffer)
End If
End If
Mid$(Dest, ccOffset + 1, L) = Source
ccOffset = ccOffset + L
End Sub
And then use it as follows:
NbOfChars = 0
Do until...
FastConcat MyString, AddedString, NbOfChars
Loop
MyString = Left$(MyString,NbOfChars)
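The general idea behind GetFileLines above (scan the file in fixed-size chunks, keep an overlap so that a match spanning two chunks is not missed, then stop as soon as enough lines have been collected) can be sketched in Python as follows. This is a simplified illustration rather than a translation of the VBA: the function name lines_after is made up, the overlap is just long enough for the search string, and the result starts at the search string itself.

```python
import io

def lines_after(stream, needle: bytes, n_lines: int, bufsize: int = 100_000) -> list:
    """Return up to n_lines lines, starting at the first occurrence of needle."""
    tail = b""                 # end of the previous chunk, kept as overlap
    collected = bytearray()
    found = False
    while True:
        chunk = stream.read(bufsize)
        if not chunk:
            break
        if not found:
            window = tail + chunk
            pos = window.find(needle)
            if pos < 0:
                # keep just enough bytes to catch a match spanning two chunks
                keep = len(needle) - 1
                tail = window[-keep:] if keep > 0 else b""
                continue
            found = True
            chunk = window[pos:]
        collected += chunk
        if collected.count(b"\n") >= n_lines:
            break              # enough line breaks seen; stop reading
    return bytes(collected).splitlines()[:n_lines]

sample = b"aaa\r\nbbb FindMe ccc\r\nddd\r\neee\r\nfff\r\n"
# A tiny buffer forces the search string to straddle a chunk boundary.
print(lines_after(io.BytesIO(sample), b"FindMe", 3, bufsize=7))
```

Because reading stops once the requested number of lines is reached, memory use depends on the amount extracted rather than on the file size.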

How do I get an ASCII value to wrap to a certain value after it has passed 122?

I am trying to write an encryption program. The problem I am facing is that I am converting the text to ASCII and then adding the offset. However, when it goes past the letter 'z' I want it to wrap back to 'a' and continue from there.
Sub enc()
Text = TextBox1.Text
finalmessage = ""
letters = Text.ToCharArray
offset = ComboBox1.SelectedItem
For x = LBound(letters) To UBound(letters)
finalmessage = finalmessage + Chr(Asc(letters(x)) + offset)
Next
TextBox2.Text = finalmessage
End Sub
I guess that, to make it easy to decode afterwards, you should do it somewhat along the lines of Base64 encoding: first encode everything to a normalized binary string, then encode in the range you want (since it is binary, the alphabet size has to be a power of two, 2^X).
To match your range, I used a base set of 32 characters, with a simple encoding/decoding example (a bit more verbose than it should be, perhaps):
Module Module1
Dim encodeChars As String = "abcdefghijklmnopqrstuvwxyzABCDEF" ' use 32 as a base
Function Encode(text As String) As String
Dim bitEncoded As String = ""
Dim outputMessage As String = ""
For Each ch As Char In text.ToCharArray()
Dim i As Integer = Convert.ToByte(ch)
bitEncoded &= Convert.ToString(i, 2).PadLeft(8, "0"c)
Next
While bitEncoded.Length Mod 5 <> 0
bitEncoded &= "0"
End While
For position As Integer = 0 To bitEncoded.Length - 1 Step 5
Dim range As String = bitEncoded.Substring(position, 5)
Dim index As Integer = Convert.ToInt32(range, 2)
outputMessage &= encodeChars(index).ToString()
Next
Return outputMessage
End Function
Function Decode(encodedText As String) As String
Dim bitEncoded As String = ""
Dim outputMessage As String = ""
For Each ch In encodedText
Dim index As Integer = encodeChars.IndexOf(ch)
If index < 0 Then
Throw New FormatException("Invalid character in encodedText!")
End If
bitEncoded &= Convert.ToString(index, 2).PadLeft(5, "0"c)
Next
' strip the extra 0's
While bitEncoded.Length Mod 8 <> 0
bitEncoded = bitEncoded.Substring(0, bitEncoded.Length - 1)
End While
For position As Integer = 0 To bitEncoded.Length - 1 Step 8
Dim range As String = bitEncoded.Substring(position, 8)
Dim index As Integer = Convert.ToInt32(range, 2)
outputMessage &= Chr(index).ToString()
Next
Return outputMessage
End Function
Sub Main()
Dim textToEncode As String = "This is a small test, with some special characters! Just testing..."
Dim encodedText As String = Encode(textToEncode)
Dim decodedText As String = Decode(encodedText)
Console.WriteLine(textToEncode)
Console.WriteLine(encodedText)
Console.WriteLine(decodedText)
If Not String.Equals(decodedText, textToEncode) Then
Console.WriteLine("Encoding / decoding failed!")
Else
Console.WriteLine("Encoding / decoding completed succesfully!")
End If
Console.ReadLine()
End Sub
End Module
This then gives the following output:
This is a small test, with some special characters! Just testing...
krugsCzanfzsayjaonwwcBdmebAgkCBufqqhoAlunaqhgBBnmuqhgCdfmnuwcBbamnugcCtbmnAgkCtteeqeuDltoqqhizltoruwCzzofyxa
This is a small test, with some special characters! Just testing...
Encoding / decoding completed succesfully!
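For the literal wrap-around asked about in the question (going from 'z' back to 'a'), modular arithmetic is a common alternative to the base-32 approach above. A minimal Python sketch, assuming only lower-case letters are shifted and everything else is left unchanged; the function names are made up for the example:

```python
def shift_letter(ch: str, offset: int) -> str:
    """Shift a lower-case letter by offset, wrapping from 'z' back to 'a'."""
    if "a" <= ch <= "z":
        return chr((ord(ch) - ord("a") + offset) % 26 + ord("a"))
    return ch   # anything that is not a-z is left as it is

def shift_text(text: str, offset: int) -> str:
    return "".join(shift_letter(c, offset) for c in text)

print(shift_text("xyz", 3))
# Shifting by the negative offset decodes the message again.
print(shift_text(shift_text("hello, world", 5), -5))
```

The same formula works in VB with the Mod operator, as long as the value passed to Mod is not negative (for decoding, adding 26 before taking Mod avoids that).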

Open/Read a binary file - access rights

I am trying to convert VB5 to .NET and cannot get a binary read to work. My VB.NET decode only reads the first two characters correctly.
The (VB5->VB.NET) encode is
' Open file
x = Rnd(-mKeyValue)
filenum = FreeFile()
Try
FileOpen(filenum, Filename, OpenMode.Binary)
Catch ex As IO.IOException
MsgBox(ex.ToString, MsgBoxStyle.Critical, "File opening error")
Exit Sub
End Try
' write data
filecontents = ""
For i = 1 To Len(stringdate)
charnum = Asc(Mid(stringdate, i, 1))
randomint = Int(256 * Rnd())
charnum = charnum Xor randomint
singlechar = Chr(charnum)
FilePut(filenum, singlechar, i)
filecontents = filecontents & singlechar
Next i
And the (VB5->VB.NET) decode is
x = Rnd(-mKeyValue)
filenum = FreeFile()
FileOpen(filenum, Filename, OpenMode.Binary)
For i = 1 To LOF(filenum)
'To VB.NET
FileGet(filenum, singlechar, i)
charnum = Asc(singlechar)
Debug.Print("VB5 singlechar = " & singlechar)
randomint = Int(256 * Rnd())
charnum = charnum Xor randomint
singlechar = Chr(charnum)
Next i
My VB.NET code, which fails (it cannot read the file correctly), is:
Using reader As New BinaryReader(File.Open(Filename, FileMode.Open))
' Loop through length of file.
Dim pos As Integer = 0
Dim length As Integer = reader.BaseStream.Length
While pos < length
' Read the integer.
singlechar = reader.ReadChar()
charnum = Asc(singlechar) 'singlechar is type Char
randomint = Int(256 * Rnd())
charnum = charnum Xor randomint
singlechar = Chr(charnum)
i += 1
End While
End Using
Can anyone help me with translation from VB5 to .NET?
In VB.Net everything is a bit shorter ;)
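' get a string from an encrypted file:
' (call Rnd(-mKeyValue) first, as in the original code, to reproduce the same sequence)
Dim b() As Byte = IO.File.ReadAllBytes("path")
For i = 0 To b.Length - 1
b(i) = b(i) Xor CByte(Int(256 * Rnd())) ' Int() truncates, as in the VB5 code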
Next
Dim s As String = System.Text.Encoding.ASCII.GetString(b)
Why read byte by byte (there is no point in reading a Char anyway, since you only want the 8-bit ASCII code) when .NET can read it all at once? Your file is not larger than 100 MB, I assume? Then, after getting the array, you simply XOR each element with your "random" value. If you don't need to be compatible with old versions, you would be better off using Random. Or maybe even better: use real encryption (it's built into .NET!).
' put a string into a file
Dim c() As Byte = System.Text.Encoding.ASCII.GetBytes("The String you want to store encrypted")
For i = 0 To c.Length - 1
c(i) = c(i) Xor CByte(Int(256 * Rnd())) ' Int() truncates, as in the VB5 code
Next
IO.File.WriteAllBytes("another path", c)
The same goes for "encrypting": convert the string to an array of bytes (the ASCII values), XOR it, and then write it back in one operation.
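The scheme described here is symmetric: XOR-ing with the same pseudo-random byte sequence a second time restores the original data, provided the generator is seeded identically both times. A small Python sketch of that property follows. Note the assumptions: Python's generator produces a different sequence from VB's Rnd, so this cannot read files written by the VB code, and the function name and seed value are made up. Like the original, it is obfuscation rather than real encryption.

```python
import random

def xor_with_keystream(data: bytes, seed: int) -> bytes:
    """XOR every byte with a pseudo-random byte from a seeded generator."""
    rng = random.Random(seed)   # same seed, same byte sequence
    return bytes(b ^ rng.randrange(256) for b in data)

plain = "The String you want to store encrypted".encode("ascii")
scrambled = xor_with_keystream(plain, seed=12345)
# Applying the same operation with the same seed undoes it.
print(xor_with_keystream(scrambled, seed=12345).decode("ascii"))
```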
See the dangers of Unicode:
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
' Beware of UNICODE ... !!!
Using sw As New FileStream("foo.foo", FileMode.OpenOrCreate, FileAccess.Write)
' with old VB you effectively wrote BYTE data
sw.Write({65, 192}, 0, 2)
End Using
Using br As New BinaryReader(File.Open("foo.foo", FileMode.Open, FileAccess.Read))
' You are telling .NET that you expect a Char, which is Unicode, not ASCII
Dim c As Char = br.ReadChar
Dim d As Char = br.ReadChar
Dim cc = Asc(c)
Dim dd = Asc(d)
Debug.Print("65 -> {0}, 192 -> {1}", cc, dd)
End Using
End Sub
The output is 65 -> 65, 192 -> 63
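The 63 in that output is the code of "?". BinaryReader decodes characters as UTF-8 unless told otherwise, and a lone byte 192 is not valid UTF-8, so it becomes a replacement character that has no single-byte equivalent. The Python sketch below reproduces the effect; using cp1252 to stand in for the ANSI code page that Asc consults is an assumption made for this illustration.

```python
raw = bytes([65, 192])

# Decoding as UTF-8: byte 192 is invalid on its own and gets replaced.
as_text = raw.decode("utf-8", errors="replace")
print([hex(ord(c)) for c in as_text])

# Mapping the replacement character back to a single byte yields "?" (63).
back = as_text.encode("cp1252", errors="replace")
print(list(back))

# Treating the data as single-byte text instead keeps both values intact.
print(list(raw.decode("latin-1").encode("latin-1")))
```

Reading the file as bytes, as the answer suggests, avoids the decoding step altogether.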