How do I count the number of times a value is repeated across different files? - vba

I want to know the number of times a value is repeated, for example 288, and how many values there are in total (every number) in many files with the same format.
For one worksheet I would just use =COUNTIF(F:F;288) and =COUNTA(F:F)
But now I have to do it with more than 30000 xlsx files inside a folder.
My first attempt was to merge them into one workbook and then count there, but the merge stopped after 5279 tabs, presumably because of some limitation.
All my files are in the same folder (H:\Macro\positions) and the values are only expected in column F.
There are between 100-600 values per file, around 30000 files.
The operation has to be done just once, I don't mind waiting some hours for it to finish.
How would you do it?

Try the code below and follow the comments - basically the code opens each spreadsheet in the given folder, loops through the sheets in that workbook, runs your COUNTIF formula for each sheet and keeps a record of the total count.
Option Explicit
Sub CheckForValue()
Dim objFso As FileSystemObject '<-- add Microsoft Scripting Runtime as a reference
Dim objFile As File
Dim wbToCheck As Workbook
Dim wsToCheck As Worksheet
Dim strPath As String
Dim varValue As Variant
Dim lngValueCount As Long
Dim lngTotal As Long
Dim wsf As WorksheetFunction
On Error Goto CleanUp
strPath = "H:\Macro\positions"
Set objFso = New FileSystemObject '<-- access to file system
varValue = 288 '<-- value you are looking for
lngTotal = 0 '<-- total count of value you are looking for
Set wsf = Application.WorksheetFunction '<-- shortcut to WorksheetFunction
' iterate files in folder
For Each objFile In objFso.GetFolder(strPath).Files
' only check spreadsheets
If objFile.Type = "Microsoft Excel Worksheet" Then
' get reference to workbook
Set wbToCheck = Workbooks.Open(objFile.Path)
' iterate worksheets
For Each wsToCheck In wbToCheck.Worksheets
' your original formula
lngValueCount = wsf.CountIf(wsToCheck.Range("F:F"), varValue)
' add to total
lngTotal = lngTotal + lngValueCount
Next wsToCheck
' close without saving changes
wbToCheck.Close SaveChanges:=False
End If
Next objFile
' final count of value you are looking for
Debug.Print "Total is: " & lngTotal
CleanUp:
' error handling
If Err.Number <> 0 Then
Debug.Print Err.Description
End If
Set objFile = Nothing
Set objFso = Nothing
End Sub
Since you said the operation has to be done just once and you don't mind waiting some hours for it to finish, the above code will do: it simply grinds through the sheets checking for the value. If you want to improve the speed, add the following before the For loop:
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Application.EnableEvents = False
Then restore the settings afterward (after the CleanUp: label):
Application.ScreenUpdating = True
Application.DisplayAlerts = True
Application.EnableEvents = True
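The question also asks for the total number of values, which the code above does not track. A minimal sketch that keeps both counts in one pass, assuming every file is an .xlsx directly inside the folder and that COUNTA over column F is the total you want (it also counts any header cell). The procedure and variable names are my own, not from the original answer:

```vba
Option Explicit

' Sketch: count one value and all non-empty cells in column F
' across every .xlsx file in a folder. Names here are illustrative.
Sub CountValueAndTotal()
    Const FOLDER_PATH As String = "H:\Macro\positions\"  ' folder from the question
    Const TARGET_VALUE As Long = 288                     ' value from the question

    Dim currentFile As String
    Dim wb As Workbook
    Dim ws As Worksheet
    Dim matchCount As Long
    Dim totalCount As Long

    Application.ScreenUpdating = False
    Application.EnableEvents = False

    ' Dir with a pattern returns the first match; Dir with no arguments returns the next one
    currentFile = Dir(FOLDER_PATH & "*.xlsx")
    Do While Len(currentFile) > 0
        ' Open read-only and without updating links, since nothing is saved
        Set wb = Workbooks.Open(Filename:=FOLDER_PATH & currentFile, _
                                UpdateLinks:=0, ReadOnly:=True)
        For Each ws In wb.Worksheets
            matchCount = matchCount + _
                Application.WorksheetFunction.CountIf(ws.Range("F:F"), TARGET_VALUE)
            totalCount = totalCount + _
                Application.WorksheetFunction.CountA(ws.Range("F:F"))
        Next ws
        wb.Close SaveChanges:=False
        currentFile = Dir
    Loop

    Application.EnableEvents = True
    Application.ScreenUpdating = True

    Debug.Print "Occurrences of " & TARGET_VALUE & ": " & matchCount
    Debug.Print "Non-empty cells in column F: " & totalCount
End Sub
```

Using Dir instead of FileSystemObject avoids the extra reference; which one is faster over 30000 files is something you would have to measure.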

Related

Find and replace specific string inside a formula in multiple excel workbooks

I have a directory with 6 sub-folders and ~300 excel workbooks(Growing every day).
Each workbook has multiple formulas (~1200 per workbook) that reference a CSV data dump stored on a server path.
My issue is that excel treats the CSV data dump as "dirty data" and prompts warnings every time a workbook is opened claiming it can't update the links(But when the links are checked, excel then says there's no issue).
In all my research I've found there doesn't seem to be a way to fix this other than to replace the data source with an .xls file, which Excel doesn't have any issues referencing.
What I need to do, is perform a find and replace on ~300 workbooks, find the CSV server path inside the formulas and replace it with the new server path for the .xls file.
I've tried "Sobolsoft's Excel Find and Replace" software, but that doesn't seem to want to look inside formulas to replace. I've used "Easy-XL" and "Kutools" both of which only work on open workbooks (Which I could live with, if I had to open 20-50 workbooks at a time, run the find and replace, then open the next batch) but neither of them wanted to work either.
I've used the following macro to unprotect/protect each workbook in the directory which works perfectly
Const cStartFolder = "M:\Transfer\DrillHole_Interaction\4.For_Survey" 'no slash at end
Const cFileFilter = "*.xlsm"
Const cPassword = "" 'use empty quotes if blank
Sub UnprotectAllWorksheets()
Dim i As Long, j As Long, arr() As String, wkb As Workbook, wks As Worksheet
ExtractFolder cStartFolder, arr()
On Error Resume Next
j = -1: j = UBound(arr)
On Error GoTo 0
For i = 0 To j
Set wkb = Workbooks.Open(arr(i), False)
For Each wks In wkb.Worksheets
wks.Protect cPassword, True, True
Next
wkb.Save
wkb.Close
Next
End Sub
Sub ExtractFolder(Folder As String, arr() As String)
Dim i As Long, objFS As Object, objFolder As Object, obj As Object
Set objFS = CreateObject("Scripting.FileSystemObject")
Set objFolder = objFS.GetFolder(Folder)
For Each obj In objFolder.SubFolders
ExtractFolder obj.Path, arr()
Next
For Each obj In objFolder.Files
If obj.Name Like cFileFilter Then
On Error Resume Next
i = 0: i = UBound(arr) + 1
On Error GoTo 0
ReDim Preserve arr(i)
arr(i) = objFolder.Path & Application.PathSeparator & obj.Name
End If
Next
End Sub
If it would help, I'm also open to copying from a 'Master' workbook and copying the specific range into each other workbook (Copy range to range for each book) but I'm at my wits end and do not know how to proceed.
Any help would be appreciated.
No need to find and replace the csv fullname (path & filename) within all formulas, just change the links source at once within each workbook.
Try this within a loop through all workbooks that need to be changed.
Dim Wbk As Workbook
Application.DisplayAlerts = False
Set Wbk = Workbooks.Open(Filename:="WbkTarget.Fullname", UpdateLinks:=3)
With Wbk
.ChangeLink _
Name:="CsvFile.Fullname", _
NewName:="XlsFile.Fullname", _
Type:=xlExcelLinks
.Save
.Close
End With
Application.DisplayAlerts = True
where:
WbkTarget.Fullname: Path and name of the workbook with the link to be replaced
CsvFile.Fullname: Path and name of the csv file to be replaced
XlsFile.Fullname: Path and name of the xls that replaces the csv file
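As a rough sketch of the surrounding loop, the snippet can be combined with the ExtractFolder helper and the cStartFolder constant from the question (this assumes the code lives in the same module as those). The two link paths are hypothetical placeholders, and opening with UpdateLinks:=0 instead of 3 is my own choice to avoid a refresh attempt against the old source:

```vba
' Sketch only: OLD_LINK and NEW_LINK are placeholder paths, not real ones.
Sub ReplaceCsvLinkInAllWorkbooks()
    Const OLD_LINK As String = "\\server\share\data.csv"  ' hypothetical old CSV source
    Const NEW_LINK As String = "\\server\share\data.xls"  ' hypothetical new XLS source

    Dim files() As String
    Dim i As Long
    Dim lastIndex As Long
    Dim wbk As Workbook

    ' Collect matching workbooks from the start folder and its sub-folders
    ExtractFolder cStartFolder, files()

    ' UBound fails on an empty array, in which case lastIndex stays -1
    On Error Resume Next
    lastIndex = -1: lastIndex = UBound(files)
    On Error GoTo 0

    Application.DisplayAlerts = False
    For i = 0 To lastIndex
        Set wbk = Workbooks.Open(Filename:=files(i), UpdateLinks:=0)

        ' ChangeLink raises an error when the workbook has no such link,
        ' so only save when the call succeeded
        On Error Resume Next
        wbk.ChangeLink Name:=OLD_LINK, NewName:=NEW_LINK, Type:=xlExcelLinks
        If Err.Number = 0 Then wbk.Save
        On Error GoTo 0

        wbk.Close SaveChanges:=False
    Next i
    Application.DisplayAlerts = True
End Sub
```

I have not checked how this behaves with protected sheets, so it may need to run between your unprotect and protect passes.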

Excel loop macro ending early and needing to keep files open to copy several loops(different files)

I'm having a bit of a problem with this VBA code
Sub upONGOING_Train1()
ScreenUpdating = False
'set variables
Dim rFndCell As Range
Dim strData As String
Dim stFnd As String
Dim fCol As Integer
Dim oCol As Integer
Dim SH As Worksheet
Dim WS As Worksheet
Dim strFName As String
Dim objCell As Object
Set WS = ThisWorkbook.Sheets("Trains")
For Each objCell In WS.Range("L3:L100")
oCol = objCell.Column
strFName = WS.Cells(, oCol).Offset(objCell.Row - 1, 0)
On Error GoTo BLANK: 'skip macro if no train
Workbooks.Open Filename:=strFName 'open ongoing report
Set SH = Worksheets("Trains") 'set sheet
stFnd = WS.Cells(, oCol).Offset(objCell.Row - 1, 2).Value 'set connote
With SH
Set rFndCell = .Range("C3:C1100").Find(stFnd, LookIn:=xlValues)
If Not rFndCell Is Nothing Then
fCol = rFndCell.Column
WS.Cells(, oCol).Offset(objCell.Row - 1, 3).Resize(1, 6).Copy
SH.Cells(, fCol).Offset(rFndCell.Row - 1, 10).Resize(1, 6).PasteSpecial xlPasteValues 'paste values in ongoing report if connote found
ActiveWorkbook.Save 'save ongoing report
ActiveWorkbook.Close 'close ongoing report
Else 'Can't find the item
End If
End With
BLANK:
Next objCell
ScreenUpdating = True
End Sub
What I want it to do is - for every row in L3:L100
Open file listed in column "L" (if there or skip line to next one) and go to sheet
Match value from original sheet column "N" to "C3:C1100" in newly opened sheet
Copy columns "O:T" and paste relative to the matching value in the opened sheet(M:R) and save
However when I leave a gap of 2 rows it gives me the error for file not found instead of proceeding to the next loop like it does when there is only 1 row missing.
Seems i can't post images yet.
Also if anyone can point me in a good direction on how to open the sheet in the cell reference only if it is not already open it will usually only have 2 files to use (max of 4 at end of quarter).
Its just too much trouble to click OK on all the windows that pop up when you try to reopen an already open workbook.
If its any help to get your head around it.
I have 2 separate reports for 2 clients(new each quarter so max of 4 sheets at a time) that will already have the names to be searched (2 sheets in each book).
Any help would be greatly appreciated
Thanks heaps
Thanks to those who have put forth suggestions and code.
I'll try them out tomorrow; failing that, I've just come up with another idea: re-purposing some other code I have that I didn't realize would help.
The code basically copies what I need to a blank tab and deletes rows with a given value - with some formulas to help sort this would give me a block of rows with no breaks all going to the same destination file.
That would let me run the loop (now a bit more streamlined, thanks everyone) over the remaining rows.
On Error GoTo BLANK
Workbooks.Open Filename:=strFName
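The second missing file fails because the first error jumped to BLANK without a Resume, so VBA is still inside the error handler and cannot trap another error. Change the above into this: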
On Error Resume Next
Workbooks.Open Filename:=strFName
If Err.Number <> 0 Then Goto Blank
As to how to keep the workbook open: you can leave it open (no .Close), but then before opening it, check whether it is already open (i.e. using Workbooks("name")) with some error handling using the same mechanism as above. If an error occurs, the workbook is not open yet, so you open it.
Finally, avoid counting on the Active stuff, such as ActiveWorkbook. Instead, make an explicit reference to your workbook, i.e.:
Set wb = Workbooks.Open(Filename:=strFName)
Set SH = wb.Worksheets("Trains")
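The "open only if not already open" check described above could be sketched like this. GetOrOpenWorkbook is a name I made up, and the function assumes the cell holds a full path with backslashes:

```vba
' Sketch: return the workbook at fullPath, reusing it if it is already open.
Function GetOrOpenWorkbook(ByVal fullPath As String) As Workbook
    Dim wb As Workbook
    Dim shortName As String

    ' Open workbooks are indexed by file name only, not by full path
    shortName = Mid$(fullPath, InStrRev(fullPath, "\") + 1)

    ' If no open workbook has that name, the lookup fails and wb stays Nothing
    On Error Resume Next
    Set wb = Workbooks(shortName)
    On Error GoTo 0

    ' Not open yet: open it now (a missing file raises an error to the caller)
    If wb Is Nothing Then Set wb = Workbooks.Open(Filename:=fullPath)

    Set GetOrOpenWorkbook = wb
End Function
```

In the loop you would then write Set wb = GetOrOpenWorkbook(strFName) and close the workbooks once at the end.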
To consider only non-blank cells you can use the SpecialCells() method of the Range object and drop the On Error GoTo statements, which should be used only in a very few limited cases (one of which we'll see in a second).
Furthermore, you're using some needlessly long-winded constructs to reference your relevant cells, for instance:
WS.Cells(, oCol).Offset(objCell.Row - 1, 0)
is equivalent to objCell itself!
There are more examples of that kind.
Finally, let's come to the workbook open/close issue.
You could:
use a Dictionary object to store the names of all opened workbooks, so as to leave them open throughout your macro and close them all at the end of it
adopt a helper function that tries to set the wanted sheet (i.e. "Trains") in the wanted workbook (i.e. the one whose name is the current objCell value) and returns False if not successful
All of the above is in this refactoring of your code:
Sub upONGOING_Train1bis()
Dim rFndCell As Range
Dim SH As Worksheet
Dim objCell As Range
Dim shtDict As New Scripting.Dictionary '<--| this is the dictionary that will store every opened workbook name as its 'keys' (requires a reference to Microsoft Scripting Runtime)
Dim key As Variant
' Dim dec As String '<--| do you actually need it?
Application.ScreenUpdating = False
With ThisWorkbook.Sheets("Trains") '<-- reference your working worksheet
' dec = .Range("L1") '<-- what's this for? in any case take it out of for loops since its value doesn't depend on current loop variables
For Each objCell In .Range("L3:L100").SpecialCells(xlCellTypeConstants) '<--| loop through L3:L100 range not blank cells only
If TrySetWorksheet(objCell.Value, "Trains", SH) Then '<--| try to set the wanted worksheet in the wanted workbook: if successful it returns 'True' and leaves you with the 'SH' variable set to the wanted worksheet
shtDict(SH.Parent.Name) = shtDict(SH.Parent.Name) + 1
Set rFndCell = SH.Range("C3:C1100").Find(objCell.Offset(, 2).Value, LookIn:=xlValues, lookAt:=xlWhole) '<--| specify at least 'LookIn' and 'LookAt' parameters
If Not rFndCell Is Nothing Then rFndCell.Offset(, 10).Resize(, 6).Value = objCell.Offset(, 3).Resize(, 6).Value
End If
Next objCell
End With
For Each key In shtDict.Keys '<--|loop through opened workbooks dictionary keys
Workbooks(key).Close True '<--| close workbook whose name corresponds to current dictionary key
Next
Application.ScreenUpdating = True
End Sub
Function TrySetWorksheet(fileName As String, shtname As String, sht As Worksheet) As Boolean
Set sht = Nothing
On Error Resume Next
Set sht = Workbooks(Right(fileName, Len(fileName) - InStrRev(fileName, "\"))).Worksheets(shtname) '<--| try looking for an already open workbook with wanted name and wanted sheet
If sht Is Nothing Then Set sht = Workbooks.Open(fileName:=fileName).Worksheets(shtname) '<--| if not found then try opening the wanted workbook and set the wanted sheet in it
TrySetWorksheet = Not sht Is Nothing '<--| set the return value to the final result of attempts at locating the wanted sheet
End Function

Store all currently opened excel workbooks and open it later

I am currently facing a problem in which I want to:
1.Store all currently opened excel workbooks in a an array
2.Save and close the workbook
3.Open back all opened workbooks
4.Focus back to a specific workbook
The current code i have:
For Each wb In Application.Workbooks
wb.Save
Next wb
It works as expected, but my different Excel workbooks keep 'flashing', which is kind of irritating, thus the need to save and close them all.
I do understand that to focus back on a specific workbook you can use the Activate method. If I fill an array inside the For Each loop, it will not work, as it will become a double For loop.
As I'm new to VBA, I would really appreciate any input from you all.
Thank you!
I've given you two different options in this code. Either using a collection or an array.
You can step through a collection using For Each item in Collection loop while the array would need a For..Next loop.
Sub All_OpenWorkBooks_Collection()
Dim wrkBk As Workbook
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Add to a collection '
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim vItem As Variant
Dim colWorkBooks As Collection
Set colWorkBooks = New Collection
For Each wrkBk In Workbooks
If wrkBk.Name <> ThisWorkbook.Name Then
colWorkBooks.Add wrkBk.FullName
wrkBk.Close SaveChanges:=True
End If
Next wrkBk
Set wrkBk = Nothing
For Each vItem In colWorkBooks
Workbooks.Open (vItem)
Next vItem
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Set a reference to a specific workbook - can then use wrkBk to refer to it. '
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Set wrkBk = Workbooks("Copy (4) of New Microsoft Excel Worksheet.xlsx")
wrkBk.Activate
End Sub
'------------------------------------------------------------------------
Sub All_OpenWorkbooks_Array()
Dim wrkBk As Workbook
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Add to an array. '
''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim x As Long
Dim arrWrkBk() As Variant
ReDim arrWrkBk(1 To Workbooks.Count)
For x = Workbooks.Count To 1 Step -1
If Workbooks(x).Name <> ThisWorkbook.Name Then
arrWrkBk(x) = Workbooks(x).FullName
Workbooks(x).Close SaveChanges:=True
End If
Next x
For x = 1 To UBound(arrWrkBk)
If arrWrkBk(x) <> "" Then
Workbooks.Open (arrWrkBk(x))
End If
Next x
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Set a reference to a specific workbook - can then use wrkBk to refer to it. '
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Set wrkBk = Workbooks("Copy (4) of New Microsoft Excel Worksheet.xlsx")
wrkBk.Activate
End Sub
Edit: Note I step backwards through the array loop - as it's counting open workbooks and closing them the number of open workbooks goes down as the loop progresses (so when it got to loop number 4 there's a good chance that workbook number 4 has already been closed).
Edit 2: The comment on workspaces may be just what you're after - I'd check that out first.
Would adding
Application.ScreenUpdating = False
before your loop help?
And
Application.ScreenUpdating = True
after to switch it back on.
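Put together with the loop from the question, that might look like the sketch below. Skipping read-only and never-saved workbooks is my own assumption about what you want, not something from the question:

```vba
' Sketch: save every open workbook with screen redraws suppressed.
Sub SaveAllOpenWorkbooks()
    Dim wb As Workbook

    Application.ScreenUpdating = False
    For Each wb In Application.Workbooks
        ' A never-saved workbook has an empty Path; read-only ones cannot be saved in place
        If Not wb.ReadOnly And Len(wb.Path) > 0 Then wb.Save
    Next wb
    Application.ScreenUpdating = True
End Sub
```

If the flashing goes away with this, there is no need to close and reopen anything.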

Trying to iterate through some workbooks from a list of workbooks, getting out of range errors

I have a problem. I'm guessing it's easier to show the code first and then explain it, so here goes:
Sub Test()
Dim myHeadings() As String
Dim i As Long
Dim path As String
Dim pathtwo As String
Dim currentWb As Workbook
Dim openWb As Workbook
Dim openWs As Worksheet
Set currentWb = ActiveWorkbook
path = "C:\pathto\"
pfile = Split("File1,File2,File3", ",")
myHeadings = Split("Januari,Februari,Mars,April,Maj,Juni,Juli,Augusti,September,Oktober,November,December", ",")
For j = 0 To UBound(pfile)
pathtwo = path & pfile(j) & ".xlsx"
i = 0
If IsFile(pathtwo) = True Then
For i = 0 To UBound(myHeadings)
Set openWb = Workbooks.Open(pathtwo)
Set openWs = openWb.Sheets(myHeadings(i))
If openWs.Range("C34") = 0 Then
currentWb.Sheets("Indata").Cells(70, i + 27 + 12*j).Value = ""
Else
currentWb.Sheets("Indata").Cells(70, i + 27 + 12*j).Value = openWs.Range("C34")
End If
Next i
End if
Workbooks(openWb.Name).Close
Next j
End sub
What I want is to pick a file from the pfile list, iterate through all its sheets defined in myHeadings and read the value at C34 (in reality many more values are read, but this keeps it short). After this I want to close the file, go to the next file and do the same thing until all three files are done (again, in reality there are more, some of which do not exist yet).
The function "IsFile" is
Function IsFile(fName As String) As Boolean
'Returns TRUE if the provided name points to an existing file.
'Returns FALSE if not existing, or if it's a folder
On Error Resume Next
IsFile = ((GetAttr(fName) And vbDirectory) <> vbDirectory)
End Function
written by iDevlop at stackoverflow, in this thread: VBA check if file exists
The reason why I have
currentWb.Sheets("Indata").Cells(70, i + 27 + 12*j).Value = openWs.Range("C34")
is because I want to start to write my data into currentWb at AA70 (Row 70, column 27). j*12 is because it is "periodic" depending on which file it is (the file file1 corresponds to 2015, file2 to 2016 etc), and hence in my summary I have it month and yearwise.
The problem arises when I run this macro: on the first file, at the sheet Mars, I get "subscript out of range". Before I added the iteration over files, there was no such error on the first file. Can anyone see how this can be?
Please note that indentation and so on may be somewhat off as I copied this from a much larger file with many rows in between with irrelevant code.
This isn't the right answer for your specific question, but it is how I have done something similar, and it might help you to see how I did it. Basically, it opens each CSV, copies the sheet contents and pastes them into a workbook. I was consolidating about 20 CSV dumps into one workbook to make it easier to dig through the data.
Regarding Dir()
You can invoke Dir with two arguments or with none. You initialize it with the path and, optionally, the attributes. The second time I call Dir in this sub it is without any arguments, which fetches the next matching file.
Sub Add_Sheets()
Dim ws As Worksheet
Dim PasteSheet As Worksheet
Dim wb As Workbook
Set wb = Application.Workbooks.Open("C:\Users\Desktop\CSV\All.xlsx") 'Location of where you want the workbook to be
StrFile = Dir("c:\Users\Desktop\CSV\*.csv") 'Dir of where all the CSVs were.
Do While Len(StrFile) > 0
Debug.Print StrFile
Application.Workbooks.Open ("c:\Users\Desktop\CSV\" & StrFile)
Set ws = ActiveSheet
ws.Range("A1:C" & rows.Count).Select 'Selecting Specific content on the worksheet
Selection.Copy
wb.Activate
wb.Worksheets.add(After:=Worksheets(Worksheets.Count)).name = StrFile 'Setting the sheet name to the name of the CSV file
Range("A1").PasteSpecial Paste:=xlPasteValues
StrFile = Dir
Loop
End Sub

Manipulate/copy .CSV data, without opening the file?

I'm trying to optimize some code that takes some test data stored in CSV files does some analysis and copies their data into an excel sheet. This code is often run on hundreds of tests at a time, and its taking about 4.5 seconds per test so it can take hours to complete at times.
I looked up some optimization techniques and cut it down by about .25 seconds per test but I think the bulk of the time is being taken up by excel having to "open" the individual files before it can do anything with them. Is there a way to do this more efficiently?
I am open to answers that involve using another language to compile the files into one big file if that would make things quicker.
I would open them as text rather than workbooks:
Sub ReadCSV()
Dim MyString As String
Open "C:\path\text.csv" For Input As #1 ' Open file for input.
Do While Not EOF(1) ' Loop until end of file.
Line Input #1, MyString ' Read a line into variable
Debug.Print MyString ' Print data to the Immediate window.
Loop
Close #1 ' Close file.
End Sub
This will be much faster than opening as a workbook
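To get the data onto a sheet rather than into the Immediate window, each line can be split into fields and written as one row. A minimal sketch, assuming a plain comma-separated file with no quoted fields that contain commas; the procedure name is my own and the path is the placeholder from the code above:

```vba
' Sketch: read a CSV as text and write each line to the active sheet as one row.
Sub ImportCsvAsText()
    Const CSV_PATH As String = "C:\path\text.csv"  ' placeholder path from the answer

    Dim fileNum As Integer
    Dim textLine As String
    Dim fields() As String
    Dim targetRow As Long

    fileNum = FreeFile                    ' next unused file number
    Open CSV_PATH For Input As #fileNum
    Do While Not EOF(fileNum)
        Line Input #fileNum, textLine
        If Len(textLine) > 0 Then         ' skip blank lines (Split would return an empty array)
            fields = Split(textLine, ",") ' naive split: no support for quoted commas
            targetRow = targetRow + 1
            ' Write the whole row in a single assignment instead of cell by cell
            ActiveSheet.Cells(targetRow, 1).Resize(1, UBound(fields) + 1).Value = fields
        End If
    Loop
    Close #fileNum
End Sub
```

For hundreds of files you would wrap this in a Dir loop and turn ScreenUpdating off while it runs.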
I have this procedure, which works great for handling lots of CSV files. You need to put the name of the folder containing all the CSV files in cell "D11", and it will combine them into one single file. I handle over 200 files with it and it is quick. Hope it helps.
Sub CombineAllFilesInADirectory()
Dim Path As String 'string variable to hold the path to look through
Dim FileName As String 'temporary filename string variable
Dim tWB As Workbook 'temporary workbook (each in directory)
Dim tWS As Worksheet 'temporary worksheet variable
Dim aWS As Worksheet 'active sheet in master workbook
Dim RowCount As Long 'Rows used on master sheet
Dim uRange As Range 'usedrange for each temporary sheet
Dim mWB_comb As Workbook 'master workbook, used only by this procedure
Path = Sheets("CombineFiles").Range("D11").Value
Application.EnableEvents = False 'turn off events
Application.ScreenUpdating = False 'turn off screen updating
Set mWB_comb = Workbooks.Add(1) 'create a new one-worksheet workbook
Set aWS = mWB_comb.ActiveSheet 'set active sheet variable to only sheet in mWB
If Right(Path, 1) <> Application.PathSeparator Then 'if path doesnt end in "\"
Path = Path & Application.PathSeparator 'add "\"
End If
FileName = Dir(Path & "*.csv", vbNormal) 'set first file's name to filename variable
Application.StatusBar = "reading files, please wait."
Do Until FileName = "" 'loop until all files have been parsed
If Path <> ThisWorkbook.Path Or FileName <> ThisWorkbook.Name Then
Set tWB = Workbooks.Open(FileName:=Path & FileName) 'open file, set to tWB variable
For Each tWS In tWB.Worksheets 'loop through each sheet
Set uRange = tWS.Range("A4", tWS.Cells(tWS.UsedRange.Row + tWS.UsedRange.Rows.count - 1, _
tWS.UsedRange.Column + tWS.UsedRange.Columns.count - 1)) 'set used range
If RowCount + uRange.Rows.count > 65536 Then 'if the used range wont fit on the sheet
aWS.Columns.AutoFit 'autofit mostly-used worksheet's columns
Set aWS = mWB_comb.Sheets.Add(After:=aWS) 'add a new sheet that will accommodate data
RowCount = 0 'reset RowCount variable
End If
If RowCount = 0 Then 'if working with a new sheet
aWS.Range("A1", aWS.Cells(3, uRange.Columns.count)).Value = tWS.Range("A1", _
tWS.Cells(3, uRange.Columns.count)).Value 'copy headers from tWS
RowCount = 3 'three header rows have been copied
End If
aWS.Range("A" & RowCount + 1).Resize(uRange.Rows.count, _
uRange.Columns.count).Value = uRange.Value 'move data from temp sheet to data sheet
RowCount = RowCount + uRange.Rows.count 'increase rowcount accordingly
Next 'tWS
tWB.Close False 'close temporary workbook without saving
End If
FileName = Dir() 'set next file's name to FileName variable
Loop
Application.StatusBar = "Ready"
mWB_comb.Sheets(1).Select 'select first data sheet on master workbook
Application.EnableEvents = True 're-enable events
Application.ScreenUpdating = True 'turn screen updating back on
'Clear memory of the object variables
Set tWB = Nothing
Set tWS = Nothing
Set mWB_comb = Nothing
Set aWS = Nothing
Set uRange = Nothing
End Sub