VBA - Loop through multiple subfolders on a network location with date search criteria / Improve search speed

Purpose of my question and of the VBA code:
Get specific data (a couple of columns) from each "table.csv" file in a network directory. Each networkdirectory/subfolder01/subfolder02 contains one "table.csv" file, but each networkdirectory/subfolder01 also holds about 100 other subfolders. Those other folders are not needed; the only one we are interested in is subfolder02 inside each subfolder01. The network directory contains about 15000 subfolder01 folders, but I only need the ones from, for example, Jan 2020 to Apr 2020 (about 200 subfolders).
The final purpose is to trend the data.
Issue:
I am trying to understand how I could improve the VBA code that I am currently using.
This code goes through each subfolder one by one and then checks the date and file name.
I am wondering if there is a way to add search filter criteria for the subfolder date and name to make the loop faster.
How can we avoid having the code go through every subfolder?
Please see below the code I am using.
I really appreciate your time and hope my request is clear.
Function GetFiles(startPath As String) As Collection
    Dim fso As Object, rv As New Collection, colFolders As New Collection, fpath As String
    Dim subFolder As Object, f, dMinfold, dtMod
    Set fso = CreateObject("Scripting.FileSystemObject")
    dMinfold = ThisWorkbook.Sheets("Enter_Date").Cells(2, 1)
    colFolders.Add startPath
    Do While colFolders.Count > 0
        fpath = colFolders(1)
        colFolders.Remove 1
        'process subfolders
        For Each subFolder In fso.getfolder(fpath).subfolders
            If subFolder.DateLastModified >= dMinfold Then
                colFolders.Add subFolder.Path
            End If
        Next subFolder
        'process files
        f = Dir(fso.buildpath(fpath, "*Table.csv"), vbNormal)
        Do While f <> ""
            f = fso.buildpath(fpath, f)
            dtMod = FileDateTime(f)
            If dtMod >= dMinfold And Right(f, 3) = "csv" Then
                rv.Add f
            End If
            f = Dir()
        Loop
    Loop
    Set GetFiles = rv
End Function
Then I have my code to transfer the data from each file.
Thank you.

I'll put in screenshots to clear up the Get & Transform method, since it is the GUI approach rather than code.
It is possible to filter before loading contents, which will speed things up significantly.
I tried with a few thousand subfolders filtered down to 20, loads instantly.
Here's the initial screen for get data from folder
You can then filter on path. In your case it will be based on the date from the folder name.
Now that it's filtered you can expand the content using the header button.
Inside content, you'll have to expand again to convert from csv to excel table
Choose/rename columns as needed, then hit "close and load" to drop it into excel.
Default is to a new table, but you can "load to" if something more custom is needed.
Here's your output. You can right-click refresh or refresh from vba as needed.
Edit- Just noticed that I used .txt rather than .csv for the files. Might change how a step or two looks in the middle, but the general idea is the same.
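If you would rather stay in VBA, the same filter-first idea can be sketched roughly as below. This is only a sketch under assumptions, not a tested solution for your share: it assumes the inner folder has the same name under every first-level folder (passed in as the hypothetical parameter innerFolder), that the file is literally named "table.csv", and that the first-level folder's DateLastModified is a good enough proxy for the date range you want. GetTableFiles, innerFolder and dEndExclusive are names made up for illustration.

```vba
' Sketch: enumerate only the first-level folders, filter them by date,
' then test one known path per folder instead of walking all ~100 subfolders.
Function GetTableFiles(rootPath As String, innerFolder As String, _
                       dStart As Date, dEndExclusive As Date) As Collection
    Dim fso As Object, lvl1 As Object
    Dim candidate As String
    Dim rv As New Collection

    Set fso = CreateObject("Scripting.FileSystemObject")
    For Each lvl1 In fso.GetFolder(rootPath).SubFolders
        ' Assumption: the folder's modified date identifies the period of interest
        If lvl1.DateLastModified >= dStart And lvl1.DateLastModified < dEndExclusive Then
            ' Build <subfolder01>\<innerFolder>\table.csv directly
            candidate = fso.BuildPath(fso.BuildPath(lvl1.Path, innerFolder), "table.csv")
            If fso.FileExists(candidate) Then rv.Add candidate
        End If
    Next lvl1
    Set GetTableFiles = rv
End Function

Sub DemoGetTableFiles()
    Dim files As Collection
    ' Hypothetical share and folder name; replace with your own
    Set files = GetTableFiles("\\server\share\root", "subfolder02", #1/1/2020#, #5/1/2020#)
    Debug.Print files.Count & " file(s) found"
End Sub
```

The loop still touches every first-level folder once, but it no longer lists the contents of each of them, which is usually where most of the time goes on a network drive.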

Related

Random File Selector?

It's been years since I've used Visual Basic. I downgraded from 2017 to 2010 (The version I was using while I was in school). I figured VB would be the best way to attempt a solution. (Although I'm sure there are other languages that would do it as well.)
I'm looking to get back into programming. Let me get to the problem.
My friend has an ever growing amount of text documents in a folder, and he wants a program to choose one at random, and open it.
I thought I'd put a TextBox with a Button that would let him open the folder where he stores his files. Then this program would read the number of text files in that folder, and randomly generate a number between one and that number, select, and open the document with its default program (if it's text, notepad; if it's DocX then word.)
I've been sitting at a blinking cursor for 45 minutes. I've gone on YouTube for help with this project.
Any advice, or help you guys can give me? Does this need to be simplified?
That sounds like a reasonable strategy to me.
It might be worth displaying some sort of progress to the user, say by putting the name of the file currently being read into the status bar, in case there's a long delay reading the file names due to a large number of files in the folder and/or a slow network drive. If you do this, remember to put a DoEvents into your loop to allow screen updates to display.
There's a separate thread on how to open files in their native handler here.
Hope this helps - good luck!
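A minimal sketch of that progress idea, assuming the code runs in Excel VBA (Application.StatusBar is Excel-specific; another host would need its own status display). The folder path and the procedure name ListFilesWithProgress are made up for illustration.

```vba
' Sketch: show each file name in Excel's status bar while reading a folder.
Sub ListFilesWithProgress()
    Const FOLDER_PATH As String = "C:\Docs\"   ' hypothetical folder, note the trailing backslash
    Dim fileName As String
    Dim n As Long

    fileName = Dir(FOLDER_PATH & "*.*")
    Do While fileName <> ""
        n = n + 1
        Application.StatusBar = "Reading file " & n & ": " & fileName
        DoEvents                        ' let the screen repaint during a long loop
        fileName = Dir()
    Loop

    Application.StatusBar = False       ' hand the status bar back to Excel
    MsgBox n & " file(s) found"
End Sub
```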
Option Explicit
Public oFSO As Object
Public arrFiles()
Public lngFiles As Long

Sub Main()
    Dim sPath As String
    Dim lngRandomFileNumber As Long
    sPath = InputBox("Enter folder path", "Folder path")
    If sPath = "" Then Exit Sub         ' user cancelled or entered nothing
    ' clear starting point
    lngFiles = 0
    Erase arrFiles
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    Call recurse(sPath)
    If lngFiles = 0 Then
        MsgBox "No files found in " & sPath
        Exit Sub
    End If
    Randomize
    ' Int (not CLng) keeps the result in 1..lngFiles; CLng rounds and can overshoot
    lngRandomFileNumber = Int(lngFiles * Rnd) + 1
    MsgBox "This is the random file that will be opened: " & arrFiles(lngRandomFileNumber)
    Call CreateObject("Shell.Application").Open(arrFiles(lngRandomFileNumber))
End Sub

Sub recurse(sPath As String)
    Dim oFolder As Object
    Dim oSubFolder As Object
    Dim oFile As Object
    Set oFolder = oFSO.GetFolder(sPath)
    'Collect file information
    For Each oFile In oFolder.Files
        lngFiles = lngFiles + 1
        ' entries are stored at 1..lngFiles; index 0 stays unused
        ReDim Preserve arrFiles(lngFiles)
        arrFiles(lngFiles) = oFile.Path
    Next oFile
    'looking for all subfolders
    For Each oSubFolder In oFolder.SubFolders
        'recursive call
        Call recurse(oSubFolder.Path)
    Next oSubFolder
End Sub
You can use this code in any application that supports VBA (MS Access, MS Excel, MS Word): open the VBA editor (Alt + F11) and paste the code into a module. After that, press F5 and select the Main() procedure. You'll see a prompt to enter the folder path, and after that you will get a random file path.
I think running it is the easiest way to understand what the program does.
Updated: @Belladonna mentioned it clearly; the file is opened in its default program.
NB: This code also passes through subfolders. If you want to exclude subfolders, comment out the recursive call block in the recurse procedure.

Update an excel file by multiple users at same time without opening the file

Scenario
I have an excel file that contains data. There are multiple users accessing the file at the same time.
Problem
There will be a problem if multiple users try to input data into that Excel file at the same time, because only one user is allowed to open the file at a time.
Question
Is there any way whereby I can update the excel file (Eg: add a value to a cell, delete a value from a cell, find a particular cell etc) without opening it so that multiple users can update it at the same time using excel VBA?
I first went in the direction of shared workbooks, but later found that Excel shared files are very buggy. With a shared file, Excel and its macros can be very slow, there are intermittent crashes, and sometimes the whole file gets corrupted and cannot be opened or repaired afterwards. Depending on how many users use the file, the file size can also grow quite big. So it is best not to use a shared workbook; it is not worth trying. If multiple users need to update data simultaneously, it is better to pair Excel with a database such as MS Access or MS SQL (Update MSSQL from Excel). In my situation the number of users is small, so I didn't use a database. Instead, I prompt the user to wait until the other user closes the file. The code below checks whether a file is open and, if so, prompts the user. I got this code from Stack Overflow itself and modified it to suit my needs.
Call the procedure TestFileOpened as below.
Sub fileCheck()
    Call TestFileOpened(plannerFilePathTemp)
    If fileInUse = True Then
        Application.ScreenUpdating = True
        Exit Sub
    End If
End Sub
Here plannerFilePathTemp is the location of the temporary file that belongs to your original file. Whenever an Excel file is opened, a temp file is created next to it. For example, if your original file location is
plannerFilePath = "C:\TEMP\XXX\xxx.xlsx"
then your temporary file location will be
plannerFilePathTemp = "C:\TEMP\XXX\~$xxx.xlsx"
In other words, the temporary file name will be ~$xxx.xlsx
The following code runs when you Call TestFileOpened(plannerFilePathTemp)
Public fileInUse As Boolean

Sub TestFileOpened(fileOpenedOrNot As String)
    Dim objFSO As Object
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    ' The ~$ temp file only exists while someone has the workbook open
    If objFSO.FileExists(fileOpenedOrNot) Then
        fileInUse = True
        MsgBox "Database is opened and using by " & GetFileOwner(fileOpenedOrNot) & ". Please wait a few second and click again", vbInformation, "Database in Use"
    Else
        fileInUse = False
    End If
End Sub

Function GetFileOwner(strFileName)
    Dim objWMIService As Object, objFileSecuritySettings As Object
    Dim objSD As Variant, intRetVal As Long
    Set objWMIService = GetObject("winmgmts:")
    Set objFileSecuritySettings = _
        objWMIService.Get("Win32_LogicalFileSecuritySetting='" & strFileName & "'")
    ' GetSecurityDescriptor returns 0 on success and fills objSD
    intRetVal = objFileSecuritySettings.GetSecurityDescriptor(objSD)
    If intRetVal = 0 Then
        GetFileOwner = objSD.Owner.Name
    Else
        GetFileOwner = "Unknown"
    End If
End Function
I also ran into "Out of memory" issues when using shared files. So during the process, I figured out the following methods to minimize memory consumption
Some tips to clear memory

Looping through excel file in folder skipping the pre-processed ones (vba)

I'm new to the site and I'm learning VBA.
Basically, I wrote code that loops through the Excel files in a folder and processes some data, which is then written to a single common Excel file, with the name of the processed file in column A and all the data I want to record in the following cells.
Since I'm working with a lot of XLS files and the folder is constantly updated with new files, I was wondering what the easiest way is to go through the files again when the macro starts while skipping the already processed files, so that only the new ones are recorded.
Thanks in advance
Add a function that checks if your file is already processed. Assuming that you have the list of processed files in column A of 1st Worksheet:
Function FileAlreadyProcessed(filename As String) As Boolean
    Dim r As Range, matchRes As Variant
    Set r = ThisWorkbook.Sheets(1).Range("A:A")
    matchRes = Application.Match(filename, r, 0)
    FileAlreadyProcessed = (Not IsError(matchRes))
End Function
This function will search Col A for the filename. When found, the function will return true, else false. So add a check in your loop
If Not FileAlreadyProcessed(fileName) Then
    ... do your processing
End If
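If the folder holds a lot of files, a variant worth considering is to read column A once into a dictionary and check against that, instead of searching the column for every file. The following is only a sketch under assumptions: the folder path is made up, the processed names are assumed to sit in column A of the first sheet as above, and the procedure name ProcessNewFiles is hypothetical.

```vba
' Sketch: loop over the workbooks in a folder and skip names already listed in column A.
Sub ProcessNewFiles()
    Const FOLDER_PATH As String = "C:\Data\"   ' hypothetical folder, note the trailing backslash
    Dim seen As Object
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long
    Dim fileName As String

    Set ws = ThisWorkbook.Sheets(1)
    Set seen = CreateObject("Scripting.Dictionary")
    seen.CompareMode = vbTextCompare            ' file names are not case-sensitive on Windows

    ' Load the already processed file names once
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    For i = 1 To lastRow
        If Len(ws.Cells(i, "A").Value) > 0 Then seen(CStr(ws.Cells(i, "A").Value)) = True
    Next i

    fileName = Dir(FOLDER_PATH & "*.xls*")
    Do While fileName <> ""
        If Not seen.Exists(fileName) Then
            ' ... do your processing here ...
            lastRow = lastRow + 1
            ws.Cells(lastRow, "A").Value = fileName   ' record it so the next run skips it
            seen(fileName) = True
        End If
        fileName = Dir()
    Loop
End Sub
```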

How do you retrieve the file path from a checked list box that has multiple subfiles from different directories?

I've got a checked list box that populates files and subfiles from a selected location using a folder browser dialog. What I'm trying to accomplish is retrieve each location/directory of every checked item within that list. I'm using these locations as a spot to copy and paste new files into if that makes sense.
I should add that I'm new to coding, and I don't know how to do something as "complex" as this. This is also my first post and I apologize if this is an easier type question for you guys.
For Each file As IO.FileInfo In ListBox1.CheckedItems
    Dim NewFileName As String = FindListBox2ItemThatContains(file.Name)
    Dim lstfiles As String() = IO.Directory.GetFiles(FBD2.SelectedPath)
    If NewFileName.Trim.Length > 0 Then
        IO.File.Copy(file.FullName, IO.Path.Combine(FBD2.SelectedPath, "item parent folder", NewFileName.Trim), True)
    End If
    ProgressBar1.Increment(+1)
Next

Is there a way to save all hyperlinked documents in excel?

I know this question has been answered before, but the problem I'm facing is a bit different, and this is why I'm asking for your help :)
So, I'm working with multiple Excel files that contain multiple hyperlinks leading to documents such as Excel files, PDFs, DOCs and sometimes even images. The problem with these hyperlinks is that they do not lead to a "normal" website, but to a special internal software, which in turn links to a local address on my computer that contains the desired file. That means that there is no direct link that could be grabbed with simple VBA code.
Let's have an example, assuming the internal software name is "John":
I see in the Excel documents this link: John://3434545345/345345345
When I click on it, it opens the file, which is located, for example, in: C:/local/Cutekitten.pdf
After this long intro, my question is: Is there a way to automate the process of saving each document, instead of manually opening it and saving it? Could it be solved with VBA code? Or does it require a different approach? I was actually thinking of bypassing this problem by finding a way to open all hyperlinks at once with VBA, and then maybe finding some code (not VBA?) that saves all open documents.
P.S Please keep in mind that I can't download EXE files or any other "suspicious" files due to workplace restrictions.
Any help will be much appreciated,
Thanks! :)
You may try this
'''
Input: test.xlsx
name    link
1       location/file1.jpg
2       location/file2.xlsx
3       location/file3.pdf
4       location/file4.mp4
'''
import os
import shutil

from openpyxl import load_workbook

wb = load_workbook('test.xlsx')
print(wb.sheetnames)
# ['Sheet 1', 'Sheet 2', 'Sheet 3']
ws1 = wb['Sheet 1']

## Result directory
directory = 'result'
os.makedirs(directory, exist_ok=True)

for row in ws1:
    link = row[1].value
    print(row[0].value, '\t', link)
    # Skip blank or header cells, executables, and paths that do not exist
    if not isinstance(link, str) or '.exe' in link or not os.path.isfile(link):
        continue
    # The links in this example are local paths, so copy each file into the result directory
    shutil.copy(link, os.path.join(directory, os.path.basename(link)))
'''
Output:
Table 1 None
None None
name link
1 location/file1.jpg
2 location/file2.xlsx
3 location/file3.pdf
4 location/file4.mp4
'''
Sub PathsText()
    ' Early binding: needs a reference to "Microsoft Scripting Runtime" (Tools > References)
    Dim Path As String
    Dim s As InlineShape
    Dim fso As FileSystemObject, ts As TextStream
    Set fso = New FileSystemObject
    ' 8 = ForAppending; True creates the file if it does not exist
    Set ts = fso.OpenTextFile("C:\MyFolder\List.txt", 8, True)
    ts.WriteLine ActiveDocument.InlineShapes.Count
    For Each s In ActiveDocument.InlineShapes
        Path = s.LinkFormat.SourcePath & "\" _
            & s.LinkFormat.SourceName
        ts.WriteLine Path
    Next s
    ts.Close    ' flush and release the text file
End Sub
I guess the important part is if "Path = s.LinkFormat.SourcePath" works.
If so, if you have a ton of excel files to do it for, I'd put those (the ones with the links) in one folder. I'd make a new empty workbook with one button on a single sheet to call code. The button would "For Each workbook in directory, For Each sheet in workbook, For Each link on sheet" add source path to a text file.
Once you had a list of paths the next step would be obvious. I wish I had demo code, but I migrated to c# years ago. My monstrously slow cell-by-cell search, modify and graph inventions threatened to bury me.
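For what it's worth, the "for each workbook, for each sheet, for each link" loop described above could look roughly like this in Excel VBA. Treat it as a sketch under assumptions rather than demo code that was run against your files: the folder and output paths are made up, the procedure name ListHyperlinkTargets is hypothetical, and Worksheet.Hyperlinks only covers links inserted on cells or shapes, not links produced by the HYPERLINK() worksheet function. With the custom "John://" links from the question, the address written out will be that link text, not the local file path behind it.

```vba
' Sketch: write the address of every hyperlink in every workbook of a folder to a text file.
Sub ListHyperlinkTargets()
    Const SRC_FOLDER As String = "C:\MyFolder\Links\"   ' hypothetical folder with the workbooks
    Const OUT_FILE As String = "C:\MyFolder\List.txt"   ' hypothetical output file
    Dim fso As Object, ts As Object
    Dim wb As Workbook, ws As Worksheet, hl As Hyperlink
    Dim fileName As String

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set ts = fso.OpenTextFile(OUT_FILE, 8, True)        ' 8 = ForAppending, create if missing

    fileName = Dir(SRC_FOLDER & "*.xls*")
    Do While fileName <> ""
        ' Do not open (and then close) the workbook that is running this macro
        If StrComp(fileName, ThisWorkbook.Name, vbTextCompare) <> 0 Then
            Set wb = Workbooks.Open(SRC_FOLDER & fileName, ReadOnly:=True)
            For Each ws In wb.Worksheets
                For Each hl In ws.Hyperlinks
                    ' One line per link: workbook, sheet, link address
                    ts.WriteLine fileName & vbTab & ws.Name & vbTab & hl.Address
                Next hl
            Next ws
            wb.Close SaveChanges:=False
        End If
        fileName = Dir()
    Loop

    ts.Close
End Sub
```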