Releasing memory held in objects in VB.NET - vb.net

I am using a 3rd party DLL from pdf-tools to parse PDF files inside my VB.NET application and write the extracted data to a SQL localDB database. I'm giving the user two options: to select a PDF file for parsing or to point to a folder and the application will loop through all PDF files inside it. Both options call the same procedure doPDFFile() as below.
The problem is this: if I import a number of files individually, selecting one file each time, the program runs fine. However, if I select a folder that contains the same files, the memory used by the program keeps growing. In Windows Task Manager, it can reach 1 GB after importing around 30 files.
I used the ANTS Memory Profiler from Redgate, and it showed that one object called "GraphicsState", which is part of the pdf-tools object, grows very large when looping through a folder. This does not happen if I select the same files one by one. Besides the memory problem, the application becomes very slow after parsing some files. My question is: why is this happening, and how can I prevent it? The user should be able to point the program to a folder with hundreds of PDF files; how can I achieve this?
Below is a snapshot of the code:
'When user selects one file
Private Sub OpenToolStripMenuItem_Click(...)
sPath = OpenFileDialog1.FileName
doPDFFile()
End Sub
'When user selects a folder
Private Sub LoopToolStripMenuItem_Click() Handles LoopToolStripMenuItem.Click
FolderBrowserDialog1.ShowDialog()
sPath = FolderBrowserDialog1.SelectedPath
For Each fileName As String In IO.Directory.GetFiles(...)
sPath = fileName
doPDFFile()
Next
End Sub
Inside the doPDFFile() procedure, I do the following. I use the document object from pdf-tools and pass it ByRef to other procedures:
Public Sub doPDFFile()
Using document As New Pdftools.PdfExtract.Document
document.open(sPath)
findFirstPage(document) 'passing by reference
ParseFirstPage(document) 'passing by reference
'storing the parsed text in an array
'.......
Do
'extracting the colors from the graphicsStateObject inside the document object:
Using objGraphicsState As Pdftools.PdfExtract.GraphicsState = content.GraphicsState
sColor = objGraphicsState.FillColorRGB
End Using
'save text and color in an array of objects
Loop Until endOfText
End Using
End Sub

Related

Files not opening as read-only in VB.NET when read-only attribute set

I'm having trouble with a bit of code designed to intentionally open files as read-only. My team often needs to be able to peek into each others' files without locking the file owner out, so in a form being designed for document management I want users to be able to open files optionally as read-only.
Coming from VBA I'm still somewhat new to VB.NET and also bitwise operations generally, but I believe the "read-only" interpretation of this code from MS Docs has been correctly implemented:
Dim attributes As FileAttributes
attributes = File.GetAttributes(path)
If Not (attributes And FileAttributes.ReadOnly) = FileAttributes.ReadOnly Then
' Make file readonly.
File.SetAttributes(path, File.GetAttributes(path) Or FileAttributes.ReadOnly)
End If
' Open the file
System.Diagnostics.Process.Start(path)
' Reset the file to read/write.
attributes = RemoveAttribute(attributes, FileAttributes.ReadOnly)
File.SetAttributes(path, attributes)
When I call "GetAttributes" immediately before and after the line that opens the file, it returns 1 or sometimes 33, which the FileAttributes enumeration documentation suggests is correct for what I'm trying to do. Before and after the attribute change, "GetAttributes" returns 128 or in certain cases 32, which should also be correct.
However, despite the fact that the above code appears correctly implemented and seems to be producing the correct effect on the file's attributes, files opened this way (namely Excel files) open as read-write. I'm also fine with other ways of opening a file read-only, provided that it can be used equally well on any document you would commonly encounter in an office setting (Excel, Word, etc.) with its default program. That being said, I've tried several methods and haven't had any success, and this one has seemed by far the cleanest and most promising.
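The numbers reported above are combinations of FileAttributes bit flags: ReadOnly is 1, Archive is 32 and Normal is 128, so 33 is simply ReadOnly combined with Archive. The following is a minimal sketch of that arithmetic, not part of the original post. The RemoveAttribute function here is an assumed stand-in for the helper the snippet calls (the post does not show its body), and the module name is hypothetical.

```vbnet
Imports System
Imports System.IO

Module AttributeFlagsDemo
    ' Assumed body for the RemoveAttribute helper called in the question:
    ' clears the given flag bits and leaves every other bit untouched.
    Function RemoveAttribute(attributes As FileAttributes, toRemove As FileAttributes) As FileAttributes
        Return attributes And Not toRemove
    End Function

    Sub Main()
        ' Start with a file that only has the Archive bit set (32).
        Dim attrs As FileAttributes = FileAttributes.Archive

        ' Setting ReadOnly (1) on top of Archive (32) gives 33.
        attrs = attrs Or FileAttributes.ReadOnly
        Console.WriteLine(CInt(attrs))                              ' 33
        Console.WriteLine(attrs.HasFlag(FileAttributes.ReadOnly))   ' True

        ' Clearing ReadOnly brings the value back to 32.
        attrs = RemoveAttribute(attrs, FileAttributes.ReadOnly)
        Console.WriteLine(CInt(attrs))                              ' 32
    End Sub
End Module
```

A file with no other attribute set is reported as Normal (128), which is why 1 and 128 appear as a pair in the same way that 33 and 32 do.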
Thanks in advance!
As described in comments, the file attributes are restored to their
previous state right after the Process.Start() command: the
application that opens the file has not been started yet; when it
finally accesses the file, the read-only attribute has already been
removed.
A possible solution is to subscribe to the Process.Exited event and restore the original file attributes when the Process termination is notified.
A modified version of your code: the EnableRaisingEvents property causes a process to raise the Process.Exited event. I subscribed to the event using an in-line delegate (a Lambda), but I added an example that uses a standard delegate method using the AddressOf operator (since you said you have to learn about events).
Since we want to run a file and not an executable, we need to also set UseShellExecute = True, so the Shell will find and execute the registered application associated with the file extension.
If UseShellExecute is False (the default on .NET Core and later; on .NET Framework the default is True), an exception is raised (the file is not an executable).
The name of the file to execute is assigned to the Process.StartInfo.FileName property.
When the Process terminates, the Exited event is raised. In the event handler, the file attributes are restored to the previous state.
Private Sub SomeMethod(filePath As String)
' filePath is File's Full path
Dim attributes As FileAttributes = File.GetAttributes(filePath)
File.SetAttributes(filePath, (attributes) Or FileAttributes.ReadOnly)
Dim proc As Process = New Process()
proc.StartInfo.FileName = filePath
proc.StartInfo.UseShellExecute = True
proc.EnableRaisingEvents = True
AddHandler proc.Exited,
Sub()
File.SetAttributes(filePath, attributes)
proc?.Dispose()
End Sub
proc.Start()
End Sub
If you want to use a standard method as the Exited event handler, you have to declare the filePath and attributes variables in a different scope. Neither can be a local variable, because they won't be accessible from the method delegate.
If you need to run just one file, these can be instance fields (declared in the scope of the current class).
If instead you can have multiple processes running different files, all waiting for the associated application to terminate, this information should be stored in a list of objects, a Dictionary or a similar container.
For example, using a Dictionary, declared as a Field:
(the Dictionary Key is the File path. If a file can be opened multiple times - a .txt file maybe, use a different identifier)
Private myRunningFiles As New Dictionary(Of String, FileAttributes)
' (...)
Private Sub SomeMethod(filePath As String)
Dim attributes As FileAttributes = File.GetAttributes(filePath)
If Not myRunningFiles.ContainsKey(filePath) Then
myRunningFiles.Add(filePath, attributes)
Else
' Notify that the file is already opened
Return
End If
Dim proc As Process = New Process()
' (... same ...)
AddHandler proc.Exited, AddressOf OnProcessExited
proc.Start()
End Sub
Protected Sub OnProcessExited(sender As Object, e As EventArgs)
Dim proc = DirectCast(sender, Process)
Dim filePath = proc.StartInfo.FileName
Dim attributes = myRunningFiles(filePath)
File.SetAttributes(filePath, attributes)
myRunningFiles.Remove(filePath)
proc?.Dispose()
End Sub
Thanks to @Jimi I was able to come up with the solution. In my application Excel files are the most important to open in ReadOnly so I'm happy with an Excel-only solution for the time being. While his answer for using and releasing attributes was great, it had the problem of not restoring the attributes to their default until the file is closed, which to my understanding would cause others to also open the file ReadOnly while the file is open. The point of this in my application is to "peek" at a file without locking other users on a network out of ReadWrite access, so unfortunately his well-thought-out solution won't work.
I was however able to use the /r switch with a bit of investigating. It was a bit tricky (for someone of my skill level) since some switches need to be placed before the file path and some after. Solution below:
Process.Start("EXCEL.exe", "/r " & Chr(34) & path & Chr(34))

Update an excel file by multiple users at same time without opening the file

Scenario
I have an excel file that contains data. There are multiple users accessing the file at the same time.
Problem
There will be a problem if multiple users try to input data into that excel file at the same time, because only one user is allowed to open the file at a time.
Question
Is there any way whereby I can update the excel file (Eg: add a value to a cell, delete a value from a cell, find a particular cell etc) without opening it so that multiple users can update it at the same time using excel VBA?
I went in the direction of using shared files, but later found out that excel shared files are very buggy. With a shared file, excel and its macros can be very slow and crash intermittently, and sometimes the whole file gets corrupted and cannot be opened or repaired afterwards. Also, depending on how many users use the file, the file size can grow quite big. So it is best not to use a shared workbook. Totally not worth trying. Instead, if multiple users need to update data simultaneously, it is better to use a database such as MSAccess or MSSql (Update MSSQL from Excel) together with excel. In my situation the number of users is small, so I didn't use any database; instead I prompt the user to wait until the other user closes the file. Please see the code below, which checks whether a file is open and, if so, prompts the user. I got this code from stack overflow itself and modified it to suit my needs.
Call the TestFileOpened procedure as below.
Sub fileCheck()
Call TestFileOpened(plannerFilePathTemp)
If fileInUse = True Then
Application.ScreenUpdating = True
Exit Sub
End If
End Sub
Here plannerFilePathTemp is the location of the temporary file that belongs to your original file. Whenever an excel file is opened, a temp file is created next to it. For example, if your original file location is as below
plannerFilePath = "C:\TEMP\XXX\xxx.xlsx"
Thus your temporary file location will be
plannerFilePathTemp = "C:\TEMP\XXX\~$xxx.xlsx"
or in other words, temporary file name will be ~$xxx.xlsx
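The naming rule just described can be sketched as a small helper. This is an illustration in VB.NET rather than the VBA used in this answer, and GetOwnerFilePath and the module name are hypothetical names that do not appear in the original code. It assumes the temp file always sits in the same folder as the workbook and is named by prefixing "~$" to the file name, as in the example above.

```vbnet
Imports System
Imports System.IO

Module LockFileDemo
    ' Hypothetical helper: builds the "~$" temp-file path that sits
    ' next to a workbook while it is open in Excel.
    Function GetOwnerFilePath(workbookPath As String) As String
        Dim folder As String = Path.GetDirectoryName(workbookPath)
        Return Path.Combine(folder, "~$" & Path.GetFileName(workbookPath))
    End Function

    Sub Main()
        ' Same example path as in the answer.
        Dim ownerFile As String = GetOwnerFilePath("C:\TEMP\XXX\xxx.xlsx")
        Console.WriteLine(ownerFile)

        ' If the temp file exists, someone most likely has the workbook open.
        Console.WriteLine(File.Exists(ownerFile))
    End Sub
End Module
```

Deriving the temp path from the workbook path keeps the two from drifting apart when the file is moved or renamed. In VBA the same result comes from concatenating the folder, "~$" and the file name.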
The following code runs when you Call TestFileOpened(plannerFilePathTemp)
Public fileInUse As Boolean
Sub TestFileOpened(fileOpenedOrNot As String)
Dim Folder As String
Dim FName As String
Set objFSO = CreateObject("Scripting.FileSystemObject")
If objFSO.FileExists(fileOpenedOrNot) Then
fileInUse = True
MsgBox "Database is opened and using by " & GetFileOwner(fileOpenedOrNot) & ". Please wait a few second and click again", vbInformation, "Database in Use"
Else
fileInUse = False
End If
End Sub
Function GetFileOwner(strFileName)
Set objWMIService = GetObject("winmgmts:")
Set objFileSecuritySettings = _
objWMIService.Get("Win32_LogicalFileSecuritySetting='" & strFileName & "'")
intRetVal = objFileSecuritySettings.GetSecurityDescriptor(objSD)
If intRetVal = 0 Then
GetFileOwner = objSD.Owner.Name
Else
GetFileOwner = "Unknown"
End If
End Function
I also encountered out-of-memory issues when I used shared files. During the process, I figured out the following methods to minimize memory consumption:
Some tips to clear memory

Access not finding an excel file which was created in vba on a network

I have code that opens & alters an excel table, saves it to a new (network) location, and then imports data from the newly formatted excel. The issue I have is that the code can't find the newly created file, saying it doesn't exist. I've played around with adding a 'pause' function, but I'm wondering if there's a way to refresh the network link in vba? Or is there a better method? Files will vary by size.
If I add a break in the code and let it sit for a few minutes it finds the file fine.
Private Sub btnImport_Click()
[other code]
'save excel file onto network location
xlApp.ActiveWorkbook.SaveAs ("\\network\file.xlsx")
xlApp.Quit
'Import file to temp table.
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel9, "tblImport", "\\network\file.xlsx", True
End Sub
Run-time error '3011': The Microsoft Access database engine could not find the object '\network\file.xlsx'. Make sure the object exists and that you spell its name and the path name correctly.
Free up the Excel resources when finished:
xlApp.Quit
'free any other Excel resources, then..
Set xlApp = Nothing
If it is a network issue and you need a delay then you can use Application.OnTime. For 15 seconds:
Application.OnTime Now + TimeValue("00:00:15"), "my_Procedure"
This will run your procedure (my_Procedure()) after the delay, and this procedure can perform the import.

How do I extract from a password protected zip file using VBA?

I have been struggling with this problem for quite some time now, and have trawled through various forums trying to find an answer.
I have a mailbox which receives 5 emails each morning which have zipped, password-protected .csv files which require opening and processing in Excel. I am semi-competent in VBA so the Excel side of things is no issue, but it would save me a lot of time if I could automatically unzip these files and save them to a folder.
These emails have the same subjects, attachments and passwords each day, and are from the same sender to the same mailbox. I have code which can automatically process and save the .zip files to a location, but I am still stuck with the problem of having to go into each one, enter the password, open the file, save it, etc.
I have refrained from attaching this code because I would like to see if there are better solutions than mine, and it is the unzipping and password entering that I really need help with. This being said, if you would like me to attach my code I will be happy to do so :)
To expand my comment, no input is required - the password is obtained via an event so can be provided in code.
Download the demo project and DLL
In the outlook VBA editor, right click the Project in the tree, Import File and add
cUnzip.cls
mUnzip.bas
Put the DLL in \System32 (or \SysWoW64 on 64bit)
As a naive example, add a new class cWhatever:
Private WithEvents Unzip As cUnzip
Private Sub Class_Initialize()
Set Unzip = New cUnzip
End Sub
Sub RunUnzip()
With Unzip
.ZipFile = "c:\blah\qqq.zip"
.UnzipFolder = "c:\temp"
.Unzip
End With
End Sub
Private Sub Unzip_PasswordRequest(sPassword As String, bCancel As Boolean)
sPassword = "password123"
End Sub
Then to run from elsewhere:
Dim x As New cWhatever
x.RunUnzip
The first time you run it, there is an error highlighting App.EXEName; just delete the App.EXEName &

Let VB Form prepare working environment in chosen directory

I am working on a GUI for a simulation program. The simulation program is a single .exe which is driven by an input file (File.inp placed in the same directory).
The Original.inp file functions as a template from which the form reads all the values into an array. Then it changes these values reflecting the changes done by the user in the form. After that it writes all the new values to File.inp.
By pushing the "Run" button the Simulation.exe file is executed.
The folder structure looks like this:
root
|
|---input
| |
| |--Original.inp
|
|---GUI.exe
|---Simulation.exe
|---File.inp
Ideally I would supply only the GUI, the user would select the working directory and then the GUI.exe would create an input folder and extract the Original.inp and Simulation.exe in the appropriate locations. So far I have only managed to include Original.inp and Simulation.exe as "EmbeddedResources" in my VB project and I have let my code create an input folder in the working directory chosen by the user.
Can someone please explain to me how I can extract the .inp and .exe file into the correct directories? I've searched on google, tried File.WriteAllBytes and Filestream.WriteByte but did not get the desired results.
The problem with File.WriteAllBytes was that I could not point to the embedded resource ("Simulation.exe is not a member of Resources"), and with Filestream.WriteByte I got a 0 kb file.
The question commenters are correct, this is probably a task best left for a setup program. However, that having been stated, in the interest of answering the question as asked I offer the following approach.
Contrary to your supposition in your question comment, you do need to "read" the embedded resource from the GUI's executable file, since it's an embedded resource and not an external resource. It won't magically extract itself from the executable file. You need to do the manual read from the assembly and write to your specified locations. To do this, you need to read the resource using .Net Reflection, via the currently executing assembly's GetManifestResourceStream method.
The Simulation.exe file is a binary file so it must be handled as such. I assumed that the Original.inp file was a text file since it afforded the opportunity to demonstrate different types of file reads and writes. Any error handling (and there should be plenty) is omitted for brevity.
The code could look something like this:
Imports System.IO
Imports System.Reflection
Module Module1
Sub Main()
'Determine where the GUI executable is located and save for later use
Dim thisAssembly As Assembly = Assembly.GetExecutingAssembly()
Dim appFolder As String = Path.GetDirectoryName(thisAssembly.Location)
Dim fileContents As String = String.Empty
'Read the contents of the template file. It was assumed this is in text format so a
'StreamReader, adept at reading text files, was used to read the entire file into a string
'N.B. The namespace that prefixes the file name in the next line is CRITICAL. An embedded resource
'is placed in the executable with the namespace noted in the project file, so it must be
'dereferenced in the same manner.
Using fileStream As Stream = thisAssembly.GetManifestResourceStream("SOQuestion10613051.Original.inp")
If fileStream IsNot Nothing Then
Using textStreamReader As New StreamReader(fileStream)
fileContents = textStreamReader.ReadToEnd()
textStreamReader.Close()
End Using
fileStream.Close()
End If
End Using
'Create the "input" subfolder if it doesn't already exist
Dim inputFolder As String = Path.Combine(appFolder, "input")
If Not Directory.Exists(inputFolder) Then
Directory.CreateDirectory(inputFolder)
End If
'Write the contents of the resource read above to the input sub-folder
Using writer As New StreamWriter(Path.Combine(inputFolder, "Original.inp"))
writer.Write(fileContents)
writer.Close()
End Using
'Now read the simulation executable. The same namespace issues noted above still apply.
'Since this is a binary file we use a file stream to read into a byte buffer
Dim buffer() As Byte = Nothing
Using fileStream As Stream = thisAssembly.GetManifestResourceStream("SOQuestion10613051.Simulation.exe")
If fileStream IsNot Nothing Then
'ReDim takes the upper bound, not the length, so subtract 1 to avoid an extra trailing byte
ReDim buffer(CInt(fileStream.Length) - 1)
fileStream.Read(buffer, 0, buffer.Length)
fileStream.Close()
End If
End Using
'Now write the byte buffer with the contents of the executable file to the root folder
If buffer IsNot Nothing Then
Using exeStream As New FileStream(Path.Combine(appFolder, "Simulation.exe"), FileMode.Create, FileAccess.Write, FileShare.None)
exeStream.Write(buffer, 0, buffer.Length)
exeStream.Close()
End Using
End If
End Sub
End Module
You will also have to add logic to determine if the files have been extracted so it doesn't happen every time the GUI is invoked. That's a big reason why an installation program might be the correct answer.
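One way to add that check is sketched below, assuming it is enough to test whether the target file already exists (it does not detect a changed or corrupted copy). ExtractIfMissing and the module name are hypothetical names, not part of the code above, and targetPath is assumed to be a full path that includes a folder.

```vbnet
Imports System
Imports System.IO
Imports System.Reflection

Module ResourceExtractor
    ' Hypothetical helper: copies an embedded resource to targetPath
    ' unless the file is already there. Returns True when a file was written.
    ' Assumes targetPath is a full path, so it always has a folder part.
    Function ExtractIfMissing(asm As Assembly, resourceName As String, targetPath As String) As Boolean
        ' Skip the work when an earlier run already extracted the file.
        If File.Exists(targetPath) Then Return False

        Using source As Stream = asm.GetManifestResourceStream(resourceName)
            If source Is Nothing Then
                Throw New FileNotFoundException("Embedded resource not found: " & resourceName)
            End If

            ' CreateDirectory does nothing if the folder already exists.
            Directory.CreateDirectory(Path.GetDirectoryName(targetPath))

            ' Stream.CopyTo handles text and binary resources alike.
            Using target As New FileStream(targetPath, FileMode.CreateNew, FileAccess.Write)
                source.CopyTo(target)
            End Using
        End Using

        Return True
    End Function
End Module
```

With the names used in the answer, a call could look like ExtractIfMissing(thisAssembly, "SOQuestion10613051.Simulation.exe", Path.Combine(appFolder, "Simulation.exe")). Copying the stream directly also avoids sizing a byte buffer by hand.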