Crawling through Zip files - vba

I'm trying to crawl through a certain drive and grab data off of certain .xls files that are buried in sub-directories. The drive is over a TB, and the folders don't all have the same hierarchy, so I'm crawling through all of them. So far, the script works great.
The problem is, there are zipped files in the drive. At least half the files are in zipped format. How can I crawl through these files as well?
Here is the part of my code that crawls through the sub-directories. There is another function "TrailingSlash" which just appends a "\" to the string if it doesn't already have one. I give credit to the author in my comments.
Public Function recursiveDir(colFiles As Collection, strFolder As String, strFileSpec As String, bIncludeSubfolders As Boolean) as Collection
'From Ammara.com/access_image_faq/recursive_folder_search.html
'Recursive function to search document tree from specific file extension
Dim strTemp As String
Dim colFolders As New Collection
Dim vFolderName As Variant
Dim colFiles As New Collection
Dim counter As Integer
'Add files in strFolder matching strFileSpec to colFiles
strFolder = TrailingSlash(strFolder)
strTemp = Dir(strFolder & strFileSpec)
On Error Resume Next
Do While strTemp <> vbNullString
colFiles.Add (strFolder & strTemp)
counter = counter + 1
Debug.Print ("files found: " & counter)
strTemp = Dir
Loop
If bIncludeSubfolders Then
'Fill colFolders with list of subdirectories of strFolder
strTemp = Dir(strFolder, vbDirectory)
Do While strTemp <> vbNullString
If (strTemp <> ".") And (strTemp <> "..") Then
If (GetAttr(strFolder & strTemp) And vbDirectory) <> 0 Then
colFolders.Add strTemp
End If
End If
strTemp = Dir
Loop
'Call recursiveDir for each subfolder in colFolders
For Each vFolderName In colFolders
Call recursiveDir(colFiles, strFolder & vFolderName, strFileSpec, True)
Next vFolderName
End If
recursiveDir = colFiles
End Function
The function adds all the path strings to the collection "colFolders", which I then use to open and extract data from. I'm now thinking there may not be a simple way to return a string path to a file within a zipped folder. There may need to be a separate function that is called when this function encounters a zip, that in turn crawls through the zipped folder and extracts the specific file to a local destination (as long as I don't have to extract the whole folder, we should be good).
I'm kind of lost in what I should do. Googleing around points me towards using shell.Application. I know nothing of shells, is this the path I should take?
Thanks SO - you all are awesome!

Try this code instead to search through subfolders:
Sub SO()
Dim x, i
x = GetFiles("C:\Users\SO\Folder", "*.xls*", True) '// x becomes an array of files found
For Each i In x
Debug.Print i
Next i
End Sub
'-------------------------------------------------
Function GetFiles(StartPath As String, FileType As String, SubFolders As Boolean) As Variant
StartPath = StartPath & IIf(Right(StartPath, 1) = "\", vbNullString, "\") 'Sanity check
GetFiles = Split(Join(Filter(Split(CreateObject("WScript.Shell").Exec("CMD /C DIR """ & StartPath & FileType & """ " & _
IIf(SubFolders, "/S", vbNullString) & " /B /A:-D").StdOut.ReadAll, vbCrLf), ":"), "#"), "#")
End Function
But for zip files, there isn't really anything native to windows that will allow you to do this other than the CreateObject("Shell.Application").Namespace(zipName).Items method.
I prefer to use 7-zip which is free, open-source and has a great command line utility which means you can access it via VBA too using the CreateObject("WScript.Shell") method (like above)

Related

DIR does not recognize wildcard "*"

The following code does not find the file:
xTempTmtl = "Z:\20.0 Global storage\20.1 Design Packages*" & xNewName
where xNewName = "SLR-D&C-MI0-000-TRS-007199.Pdf"
but
xTempFol = Dir("Z:\20.0 Global storage\20.1 Design Packages\DP01.1 Stops - Zone C (North)\02. Transmittals\" & xNewName)
finds the file.
Problem is that the file xNewName could be in any one of 80 folders(Dp01.1..., DP01.2....) etc, then in \02. Transmittals\
If I put a \ after the *, I get Bad file name error.
Why is the wildcard "*" not recognized?
This happens on two separate machines, one running EXCEL2010 on a Windows& PC and the other running EXCEL365 on a Windows10 laptop.
Wildcards are used to return all file or folder names in a directory. Using the Dir() function with parameters changes the path in which the function will search for files. Subsequent calls to the Dir() function will return the next file or folder after the previous call. The Dir() will not search subdirectories for a file.
You will need to do a recursive set the Dir() Attributes parameter to vbDirectory to return the subfolder names then search each subfolder.
You can find plenty of examples that use the FileSystemObject to recursively search subfolder for a file. I thought that it would be interesting to write one that uses the Dir() function.
Function FindFile(ByVal folderName As String, ByVal FileName As String, Optional ByRef FoundFile As String) As String
Dim search As String
Dim dirList As New Collection
If Not Right(folderName, 1) = "\" Then folderName = folderName & "\"
search = Dir(folderName & "\*", vbDirectory)
While Len(search) > 0
If Not search = "." And Not search = ".." Then
If GetAttr(folderName & search) = 16 Then
dirList.Add folderName & search
Else
If LCase(search) = LCase(FileName) Then
FoundFile = folderName & FileName
FindFile = FoundFile
Exit Function
End If
End If
End If
search = Dir()
Wend
Dim fld
For Each fld In dirList
If Len(FoundFile) > 0 Then
FindFile = FoundFile
Exit Function
Else
FindFile = FindFile(CStr(fld), FileName, FoundFile)
End If
Next
End Function

Excel VBA: Search for a directory

In vba, I would like to search through directories for a specific directory name. Ideally the searching time would be fast (similar from a windows search).
from different sources, I could build a script (given bellow) with a recursive sub program. The script works but it is very slow as soon as the hierarchy is a little complex.
Is there a way to make the search faster?
Sub GetFolder(Folder As String, searchF As String, colFolder As Collection)
Dim SubFolder, subF As New Collection, sf As String
If Right(Folder, 1) <> "\" Then Folder = Folder & "\"
If Dir(Folder & searchF, vbDirectory) <> "" Then colFolder.Add Folder & searchF & "\"
sf = Dir(Folder, vbDirectory)
Do While Len(sf) > 0
If sf <> "." And sf <> ".." Then
If (GetAttr(Folder & sf) And vbDirectory) <> 0 Then
subF.Add Folder & sf
End If
End If
sf = Dir()
Loop
For Each SubFolder In subF
GetFolder CStr(SubFolder), searchF, colFolder
Next
End Sub
I think you are underestimating the hierarchy size. Change your code to this one to see how many folders you are recursing through.
Option Explicit
Sub GetFolder(Folder As String, searchF As String, colFolder As Collection, ByRef counter As Long)
Dim SubFolder, subF As New Collection, sf As String
If Right(Folder, 1) <> "\" Then Folder = Folder & "\"
If Dir(Folder & searchF, vbDirectory) <> "" Then colFolder.Add Folder & searchF & "\"
sf = Dir(Folder, vbDirectory)
Do While Len(sf) > 0
If sf <> "." And sf <> ".." Then
If (GetAttr(Folder & sf) And vbDirectory) <> 0 Then
Debug.Print Folder & sf
counter = counter + 1
subF.Add Folder & sf
End If
End If
sf = Dir()
Loop
For Each SubFolder In subF
GetFolder CStr(SubFolder), searchF, colFolder, counter
Next
End Sub
Public Sub TestMe()
Dim newC As New Collection
Dim colChecked As New Collection
Dim counter As Long
GetFolder "C:\Users\<username>\Desktop\BA Tools", "v", newC, counter
Debug.Print counter
End Sub
What is the number that you get at the end of the code, when you run it?

Replace text in documents in subfolders vba

I found this thread with the same issue as mine, but I've copied the code into my project, and it doesn't seem to work.
VBA macro: replace text in word file in all sub folders
I was stepping through the code, and it gets to line 32 (under the For Each varItem in colSubFolders) but then it skips right over the find/replace section to the end of the code. Is the problem in my file format?
EDIT: Additionally, when I get to varitem in ln 31, the value of "varitem" is the name of the folder, not the names of the word documents in the folder: I think this is where the issue is.
Sub DoLangesNow()
Dim file
Dim path As String
Dim strFolder As String
Dim strSubFolder As String
Dim strFile As String
Dim colSubFolders As New Collection
Dim varItem As Variant
' Parent folder including trailing backslash
'YOU MUST EDIT THIS.
strFolder = "L:\Admin\Corporate Books\2015\2014 Consents macro\company Annual Consents"
' Loop through the subfolders and fill Collection object
strSubFolder = Dir(strFolder & "*", vbDirectory)
Do While Not strSubFolder = ""
Select Case strSubFolder
Case ".", ".."
' Current folder or parent folder - ignore
Case Else
' Add to collection
colSubFolders.Add Item:=strSubFolder, Key:=strSubFolder
End Select
' On to the next one
strSubFolder = Dir
Loop
' Loop through the collection
For Each varItem In colSubFolders
' Loop through word docs in subfolder
'YOU MUST EDIT THIS if you want to change the files extension
strFile = Dir(strFolder & varItem & "\" & "*.doc")
Do While strFile <> ""
Set file = Documents.Open(FileName:=strFolder & _
varItem & "\" & strFile)
Use CMD to get all the files into an array and work with that instead - quicker and cleaner.
Sub S_O()
Dim fileArray As Variant
fileArray = Filter(Split(CreateObject("WScript.Shell").Exec("CMD /C DIR """ & strFolder & "\*.doc*"" /S /B /A:-D").StdOut.ReadAll, vbCrLf), ".")
For Each fil In fileArray
'//
'// Insert your code for doing the replacements here
'// e.g. Workbooks.Open(fil)
'// ...
Next
End Sub

VBA code to grab all files in folder is not finding files

I am trying to set up a macro to pull all excel files in a folder into a database in access. I have the below code, but when I run the macro, it errors out into "No Files Found," so intFile = 0. However, there are files in the chosen folder. Why is it not finding them? I think I messed up the linking piece too but one problem at a time. I am obviously pretty new to VBA, so any help would be appreciated!
Thanks,
Option Compare Database
Option Explicit
'code will link to excel and pull site survey files into access tables
'Setting the path for the directory
Const strPath As String = "S:\LOG\PURCHASI\Daniel Binkoski\Outlook Attachments\R7398Z Look Forward Daily Snapshot"
'FileName
Dim strFile As String
'Array
Dim strFileList() As String
'File Number
Dim intFile As Integer
Sub Sample()
strFile = Dir(strPath & "*.xlsx")
'Looping through the folder and building the file list
strFile = Dir(strPath & "*.xlsx")
While strFile <> ""
'adding files to the list
intFile = intFile + 1
ReDim Preserve strFileList(1 To intFile)
strFileList(intFile) = strFile
strFile = Dir()
Wend
'checking to see if files where found
If intFile = 0 Then
MsgBox "No Files Found"
Exit Sub
End If
'going through the files and linking them to access
For intFile = 1 To UBound(strFileList)
DoCmd.TransferSpreadsheet acLink, , _
strFileList(intFile), strPath & strFileList(intFile), True, "A1:M50"
Next
MsgBox UBound(strFileList) & "Files were linked"
End Sub
try:
strFile = Dir(strPath & "\*.xlsx", vbNormal)
or add a final "\" onto your strPath value
You need another path separator to show you're looking in a directory, not at one.
I often use something like:
Dir(strPath & IIf(Right(strPath, 1) = "\", vbNullString, "\"))
as a check to ensure that the path always ends in a trailing backslash.

Collection Maximum Size

Here's my code:
Sub isdofsodjisf48023jroi23f984444444jiodfiosj12348023jroi23f98()
Dim colFiles As New Collection
RecursiveDir colFiles, "C:\Documents and Settings\Alex Gordon\Desktop\testing\files\", "*.xls", True
Dim vFile As Variant
For Each vFile In colFiles
Call writeincells(vFile)
Next vFile
End Sub
Public Function RecursiveDir(colFiles As Collection, _
strFolder As String, _
strFileSpec As String, _
bIncludeSubfolders As Boolean)
Dim strTemp As String
Dim colFolders As New Collection
Dim vFolderName As Variant
'Add files in strFolder matching strFileSpec to colFiles
strFolder = TrailingSlash(strFolder)
strTemp = Dir(strFolder & strFileSpec)
Do While strTemp <> vbNullString
colFiles.Add strFolder & strTemp
strTemp = Dir
Loop
If bIncludeSubfolders Then
'Fill colFolders with list of subdirectories of strFolder
strTemp = Dir(strFolder, vbDirectory)
Do While strTemp <> vbNullString
If (strTemp <> ".") And (strTemp <> "..") Then
If (GetAttr(strFolder & strTemp) And vbDirectory) <> 0 Then
colFolders.Add strTemp
End If
End If
strTemp = Dir
Loop
'Call RecursiveDir for each subfolder in colFolders
For Each vFolderName In colFolders
Call RecursiveDir(colFiles, strFolder & vFolderName, strFileSpec, True)
Next vFolderName
End If
End Function
Public Function TrailingSlash(strFolder As String) As String
If Len(strFolder) > 0 Then
If Right(strFolder, 1) = "\" Then
TrailingSlash = strFolder
Else
TrailingSlash = strFolder & "\"
End If
End If
End Function
I am filling up a Collection with a list of filenames in a directory structure.
I have 2000 files, but the Collection only return 256. Does anyone know if there is a maximum number that it won't go past?
If so, can you please suggest a better way to code this macro so that it captures all 2000 files?
The code is working OK in Excel 2007. Perhaps what is happening is that you are trying to Watch the Collection in Debug mode. The Debugger shows only the first 256 items.
How about a disconnected recordset? This thread is about VBScript, but it is very similar to VBA:
How do I sort arrays using vbscript?