Finding Max & Min for varying lines in a list VBA excel - vba

I'm having trouble turning my thought process into tangible code, and honestly I'm not sure where to start. I have a data set with two applicable columns; for the sake of simplicity we'll say A and B. Column A contains a list of three initials followed by a number, e.g. JFD3, JFD2, JFD6, EUW1, YMG2, YMG3. Column B has a value. I need to find the range from the highest to the lowest value for each set of initials, which has me thinking of a max - min solution. The list of initials isn't necessarily in order, and there could be one set of initials (with a net variance of 0, which is OK) or up to 8 sets of initials, with the numbers not necessarily being consecutive. I was thinking of some sort of Match(Left(3)), but I don't think that would encompass everything.
Any ideas on where to start would be much appreciated. I'll be happy to clarify if there are any questions.

You can use dictionaries from the Scripting Runtime to do this easily. Use two of them with the initials as the keys, one holding the minimum values found and the other holding the maximum values found.
Add a reference to the Microsoft Scripting Runtime (Tools -> References..., then check the box next to "Microsoft Scripting Runtime") or late bind (see instructions below). Something like this should do the trick; it assumes initials in column 1, values in column 2, and no headers:
Private Sub MinMax()
    Dim mins As Dictionary
    Dim maxes As Dictionary
    Dim sheet As Worksheet
    Dim row As Long
    Dim key As Variant
    Dim val As Double    'Double rather than Integer, so large or fractional values fit

    Set sheet = ActiveSheet
    Set mins = New Dictionary
    Set maxes = New Dictionary

    For row = 1 To sheet.UsedRange.Rows.Count
        key = sheet.Cells(row, 1).Value2
        If Len(key) >= 3 Then
            key = Left$(key, 3)    'group by the three initials
            val = sheet.Cells(row, 2).Value2
            If Not mins.Exists(key) Then
                mins.Add key, val
            Else
                If mins(key) > val Then mins(key) = val
            End If
            'Check maxes here, not mins: mins already contains the key at this point
            If Not maxes.Exists(key) Then
                maxes.Add key, val
            Else
                If maxes(key) < val Then maxes(key) = val
            End If
        End If
    Next row

    For Each key In mins.Keys
        Debug.Print key & ": Min = " & mins(key) & " Max = " & maxes(key)
    Next key
End Sub
To use late binding, the code is exactly the same with these exceptions. Instead of declaring mins and maxes as Dictionary, declare them as Object:
Dim mins As Object
Dim maxes As Object
And instead of setting them as New Dictionary, use CreateObject:
Set sheet = ActiveSheet
Set mins = CreateObject("Scripting.Dictionary")
Set maxes = CreateObject("Scripting.Dictionary")
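Since the question asks for the spread (max - min) per set of initials, the same two-dictionary idea can be extended to print it directly. This is only a sketch under assumptions: the name PrintRangePerInitials is made up for illustration, the data is taken to sit in columns A and B of the active sheet with no header row, and column B is assumed to hold numbers. It late binds, so no reference is needed, and it reads the range into an array once instead of touching cells in the loop.

```vba
Option Explicit

' Sketch: prints max - min of column B for each 3-letter prefix in column A.
' Assumptions: active sheet, no header row, numeric values in column B.
Public Sub PrintRangePerInitials()
    Dim mins As Object, maxes As Object
    Dim data As Variant
    Dim lastRow As Long, r As Long
    Dim key As Variant
    Dim v As Double

    Set mins = CreateObject("Scripting.Dictionary")
    Set maxes = CreateObject("Scripting.Dictionary")

    ' Two columns are read, so Value2 always returns a 2-D array.
    With ActiveSheet
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        data = .Range("A1:B" & lastRow).Value2
    End With

    For r = 1 To UBound(data, 1)
        ' Skip error cells first; Len on an error value would raise an error.
        If Not IsError(data(r, 1)) Then
            ' IsNumeric(Empty) is True, so blank cells are excluded explicitly.
            If Len(data(r, 1)) >= 3 And IsNumeric(data(r, 2)) _
               And Not IsEmpty(data(r, 2)) Then
                key = Left$(data(r, 1), 3)
                v = CDbl(data(r, 2))
                If Not mins.Exists(key) Then
                    mins.Add key, v
                    maxes.Add key, v
                Else
                    If v < mins(key) Then mins(key) = v
                    If v > maxes(key) Then maxes(key) = v
                End If
            End If
        End If
    Next r

    For Each key In mins.Keys
        Debug.Print key & ": range = " & (maxes(key) - mins(key))
    Next key
End Sub
```

A set of initials that appears only once prints a range of 0, which matches the case the question calls acceptable.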

Use a Pivot Table. Put your Column A field* in the Row Labels, then put column B in the Values twice. Change one from Sum to Min, and the other from Sum to Max.
* Not sure if you need to group by JFD for all JFDx or by each JFDx. If you need them grouped by the 3 initials, make a column C with =LEFT(A1,3), then use that in your Row Labels instead.

An approach to this could be:
Sort the data in range A:B by column A in alphabetical order. You can record a macro while sorting by hand, then edit the recorded code so that it works on whatever range the data occupies. The solution below requires sorted data, and sorting also makes many similar approaches faster.
Use While loops to run the solution. I'll leave building and testing working code to you, but this is the idea:
startSubset = 2 '<-- we start getting the key from row 2
'build the key to define the subset
keyStart = 1
currentKey = ""
Do While Not IsNumeric(Right(Left(Range("A" & startSubset),keyStart),1))
'while the last char of the key is not numeric, let's add it to the key
currentKey = currentKey & Right(Left(Range("A" & startSubset),keyStart),1)
keyStart = keyStart + 1
Loop
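After the above, the key is stored in the variable currentKey. It will be JFD if the first cell is JFD213, etc. You then loop to the end of this subset, tracking the max and min in two variables. Start both from the first value of the subset rather than from 0; otherwise min could never end up above 0 and max could never end up below it:
min = Range("B" & startSubset).Value
max = Range("B" & startSubset).Value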
Do While Left(Range("A" & startSubset),Len(currentKey)) = currentKey
If Range("B" & startSubset) < min Then min = Range("B" & startSubset)
If Range("B" & startSubset) > max Then max = Range("B" & startSubset)
startSubset = startSubset + 1
Loop
Once this is done, you just need to store the values in a collection, for example:
myObs.Add currentKey
myObs.Add min
myObs.Add max '<-- you will get something like myObs = ("JFD", 0, 100)
Then add this object to a bigger collection:
allValues.Add myObs '<-- at the end you will have something like this:
'allValues = [("JFD",0,100), ("ABC", 1, 75), ...]
and reset the values so the next subset starts clean:
Set myObs = New Collection '<-- a fresh collection for the next subset
currentKey = ""
keyStart = 1
All of the above should run inside an outer While loop that exits when there is no more data.
Please note that the code above will not work on its own; it is an outline of one possible approach that you will need to adapt to your data.
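The outline above can be assembled into one runnable routine. The following is a sketch under assumptions rather than a finished solution: it presumes the data is already sorted by column A, has a header in row 1, holds numeric values in column B, and contains no error cells. The names ScanSortedGroups and KeyOf are invented for illustration. The key is taken to be the label with its trailing digits stripped, so JFD213 yields JFD.

```vba
Option Explicit

' Returns the label without its trailing digits, e.g. "JFD213" -> "JFD".
Private Function KeyOf(ByVal label As String) As String
    Dim n As Long
    n = Len(label)
    Do While n > 0
        If Not (Mid$(label, n, 1) Like "#") Then Exit Do
        n = n - 1
    Loop
    KeyOf = Left$(label, n)
End Function

' Walks the sorted rows once and prints min, max and range per key.
Public Sub ScanSortedGroups()
    Dim ws As Worksheet
    Dim r As Long, lastRow As Long
    Dim currentKey As String
    Dim v As Double, lowest As Double, highest As Double

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    r = 2   ' row 1 is assumed to be a header
    Do While r <= lastRow
        ' Start a new subset: seed min and max with its first value.
        currentKey = KeyOf(CStr(ws.Cells(r, "A").Value2))
        lowest = CDbl(ws.Cells(r, "B").Value2)
        highest = lowest
        r = r + 1

        ' Consume the remaining rows that share the same key.
        Do While r <= lastRow
            If KeyOf(CStr(ws.Cells(r, "A").Value2)) <> currentKey Then Exit Do
            v = CDbl(ws.Cells(r, "B").Value2)
            If v < lowest Then lowest = v
            If v > highest Then highest = v
            r = r + 1
        Loop

        Debug.Print currentKey & ": min = " & lowest & ", max = " & highest & _
                    ", range = " & (highest - lowest)
    Loop
End Sub
```

Seeding the minimum and maximum from the first row of each subset avoids having to pick an artificial starting value.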

Related

Remove duplicate values and cells from one column

I have tried so many methods from the removeduplicates, selections and scripting dictionaries and I cannot get this to work. I do understand there are multiple ways to do this but if any of you can help, that would be great.
I have one list of values that I am pulling through from another sheet (up to approx 80k rows) into cell B13 downwards. I am then trying to remove the duplicate values and cells so I am left with unique values which I can then use to perform lookups on other sheets.
Sub Address_Sage()
Dim dataBook As Workbook
Dim dict As Object
Dim Sage_Data As Worksheet, Address As Worksheet
Dim dataSource As Range, dataDest As Range
Dim sourceDataRowCount As Integer, index As Integer
Dim rowCount As Long
Dim strVal As String
Set dataBook = Application.ThisWorkbook
Set sheetSource = dataBook.Sheets("Sage_Data")
Set sheetDest = dataBook.Sheets("Address")
Set dict = CreateObject("Scripting.Dictionary")
Set dataSource = sheetSource.Range("A3", _
sheetSource.Range("A90000").End(xlUp))
sourceDataRowCount = dataSource.Rows.Count
Set dataDest = sheetDest.Range("B13", "B" & _
sourceDataRowCount)
For index = 1 To sourceDataRowCount
dataDest(index, 1).Value = dataSource(index, 1).Value
Next index
Sheets("Address").Select
rowCount = ActiveSheet.Range("B13").CurrentRegion.Rows.Count
Do While rowCount > 0
strVal = Address.Cells(rowCount, 1).Value2
If dict.exists(strVal) Then
ActiveSheet.Rows(rowCount).EntireRow.Delete
Else
dict.Add strVal, 0
End If
rowCount = rowCount - 1
Loop
'Set dict = Nothing
End Sub
It always gets stuck on the strVal line. I have tried changing Value2 to Value1 as I only have one column, but no luck.
thank you
Not super experienced in VBA so I can't speak to exactly what you're doing and what your code is saying but I thought I'd share this with you. Last week I had to create a macrobook that returned the unique entries of electrical defects that different crews observed while on the job. I made a dictionary that read all of the entries in the spreadsheet and then later printed all of the unique entries. I'll post the code and try to walk you through it.
If .Range("A" & i) <> "" Then
    If dict.Exists(data) Then
        dict(data) = dict(data) + 1
    Else
        dict.Add Key:=data, Item:=1   'store the count as a number, not the string "1"
    End If
End If
So the code basically says: if the cell in column A (i is simply an incrementer) is not empty, then we read that entry. data is simply a variable, and you would set it equal to the cell value you'd like read into the dictionary. Dictionary keys are unique and cannot repeat, so the code asks if the key already exists in the dictionary. If so, we add one to its count. If not, we add that key to the dictionary. At the end of your loop, the dictionary will have stored all unique entries and the number of times they appeared.
Now we can reference them or print them.
For r = 0 To dict.Count - 1
    Sheets("Results").Range("B" & iResults) = dict.Keys(r)
    Sheets("Results").Range("C" & iResults) = dict(dict.Keys(r))
    iResults = iResults + 1   'move to the next output row
Next
This second piece of code is a loop from 0 to dict.Count - 1. It starts at zero because Keys returns a zero-based array, so the last valid index is one less than the number of entries. The first statement prints the unique key for every r until there are no more entries in the dictionary. The second statement prints the item associated with it, which is a number equal to the number of times that unique entry showed up in your data. Note that dict.Keys(r) works when dict is declared as Scripting.Dictionary (early binding); if dict is declared As Object, write dict.Keys()(r) instead.
You can use this same method for other purposes as well, not just printing the data but referencing it and using it somewhere else. But I am sure you will find that the For-loop with dict.Keys(r) is the easiest way to run through your dictionary entries. Took me a few days to figure it out and it revolutionized my program. Hope this helps you out.
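For the de-duplication problem in the question, one option is to filter while copying instead of deleting rows afterwards. The sketch below assumes the sheet names from the question (Sage_Data as the source starting at A3, Address as the destination starting at B13); the name CopyUniqueValues is hypothetical. It uses Long counters because an Integer cannot hold row counts above 32,767, which would matter with roughly 80k rows. It may also be worth noting that the question's code declares the variable Address but never assigns it with Set, which could explain the failure on the strVal line.

```vba
Option Explicit

' Sketch: copies the distinct non-blank values of Sage_Data!A3:A<last>
' to Address!B13 downwards, keeping the first occurrence of each value.
Public Sub CopyUniqueValues()
    Dim src As Variant
    Dim result() As Variant
    Dim seen As Object
    Dim lastRow As Long, r As Long, n As Long
    Dim key As String

    Set seen = CreateObject("Scripting.Dictionary")

    With ThisWorkbook.Worksheets("Sage_Data")
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        ' At least two cells are needed for Value2 to return an array.
        If lastRow < 4 Then Exit Sub
        src = .Range("A3:A" & lastRow).Value2
    End With

    ReDim result(1 To UBound(src, 1), 1 To 1)

    For r = 1 To UBound(src, 1)
        If Not IsError(src(r, 1)) Then
            key = CStr(src(r, 1))
            If Len(key) > 0 Then
                If Not seen.Exists(key) Then
                    seen.Add key, True
                    n = n + 1
                    result(n, 1) = src(r, 1)
                End If
            End If
        End If
    Next r

    ' Write only the first n rows of the buffer back in a single assignment.
    If n > 0 Then
        ThisWorkbook.Worksheets("Address").Range("B13").Resize(n, 1).Value2 = result
    End If
End Sub
```

Because nothing is deleted, there is no need to loop backwards through the destination sheet.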

Best way to populate an excel string column for fastest subsequent vba search (can I use metadata, etc?)

In a column with hundreds or even 1-2 thousand strings of approximately 40 characters, with one string per cell and many repeating entries, what is the best way to populate the column to conduct the fastest possible search later? The search should return a row number so that the corresponding row can be deleted.
Is there some way to append metadata or label to a cell/row for faster search? Is there some other mechanism that can identify cells that will make searching easier?
I'm new to VBA, and I want to set out on the best path before I get too far into the project and have to search through thousands of strings.
edit: Someone requested an example cell: The cells will have email addresses in them. I can control the email addresses on the server, so they will roughly be 40 characters long each. They will contain alphanumeric characters only.
Example of a fast way to implement a dictionary lookup
Data is on Sheet1, and starts in column A
The strings are in column B
Option Explicit
Public Sub SearchStrings()
    Dim ur As Variant, r As Long, d As Object
    Const COL_ID = 2
    Set d = CreateObject("Scripting.Dictionary") 'or add a reference to Microsoft Scripting Runtime
    d.CompareMode = vbTextCompare 'case insensitive; use vbBinaryCompare for case sensitive
    ur = Sheet1.UsedRange.Columns(COL_ID) 'read strings from column COL_ID into array
    For r = LBound(ur) To UBound(ur) 'populate dictionary; Key = string (unique)
        If Not IsError(ur(r, 1)) Then d(CStr(ur(r, 1))) = r 'Item = row index within the used range
    Next
    Debug.Print d.Keys()(3) 'prints the 4th unique string (Keys is zero-based)
    Debug.Print d.Items()(3) 'prints the row index stored for that string
End Sub
If you want to store string duplicates use this:
If Not IsError(ur(r, 1)) Then d(COL_ID & "-" & r) = CStr(ur(r, 1))
which is Key = Column ID & "-" & row ID (2-5), and Item = String itself
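Since the goal in the question is to find a row and delete it, the lookup can be wrapped as shown below. This is a sketch under assumptions: the names BuildRowIndex and DeleteRowFor are hypothetical, the column is read from row 1 so that the array index equals the worksheet row number, and when a string occurs more than once only its last row is remembered.

```vba
Option Explicit

' Builds a case-insensitive map from string to worksheet row number.
Public Function BuildRowIndex(ByVal ws As Worksheet, ByVal colId As Long) As Object
    Dim d As Object
    Dim data As Variant
    Dim lastRow As Long, r As Long

    Set d = CreateObject("Scripting.Dictionary")
    d.CompareMode = vbTextCompare

    lastRow = ws.Cells(ws.Rows.Count, colId).End(xlUp).Row
    If lastRow >= 2 Then
        ' Reading from row 1 makes the array index equal to the sheet row.
        data = ws.Range(ws.Cells(1, colId), ws.Cells(lastRow, colId)).Value2
        For r = 1 To UBound(data, 1)
            If Not IsError(data(r, 1)) Then
                ' For duplicates, the last occurrence wins.
                If Len(data(r, 1)) > 0 Then d(CStr(data(r, 1))) = r
            End If
        Next r
    End If

    Set BuildRowIndex = d
End Function

' Deletes the row that holds the target string in column B, if there is one.
Public Sub DeleteRowFor(ByVal ws As Worksheet, ByVal target As String)
    Dim rowIndex As Object
    Set rowIndex = BuildRowIndex(ws, 2)
    If rowIndex.Exists(target) Then ws.Rows(rowIndex(target)).Delete
End Sub
```

Deleting a row shifts every row beneath it, so a previously built index becomes stale; the sketch rebuilds it on each call, which should be acceptable for a few thousand strings.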

Is it possible to add cases to a Select Case based on the number of entries in a table?

I've been messing around with VBA in Excel a bit recently; and as a small project for myself, I'm trying to create a "draw names from a hat" sort of macro.
I began by generating a random number, and then choosing which entry from a Table (i.e. ListObject) would be selected using a case statement. The problem with this is that it only works if the number of Table entries is always the same.
So my question (probably a ridiculous one) is: is it possible at all to generate a dynamic 'Select Case' block, where the number of cases on the block is based on the number of entries in the Table?
Thanks.
-Sean
Edit: To clarify: what I am trying to do, exactly, is this:
I generate a random number, i, from 1 to n=10*(number of Table entries). After this, I want to display, in a cell, one of the table entries based on the random number.
Ideally, the code would work similarly to this:
if i = 1 to 10 then choose event 1
if i = 11 to 20 then choose event 2
if i = 21 to 30 then choose event 3
...
if i = (n-9) to n then choose event (n/10)
I hope this helps to clarify the goal of the code.
From our comments here is something you can use:
Sub random()
    Dim used_rows As Long
    Dim random_index As Long
    Dim cell_array() As Variant   'Variant, since the cells may hold names rather than numbers
    Dim i As Long
    used_rows = Sheet1.UsedRange.Rows.Count
    ReDim cell_array(used_rows - 1)   'zero-based: indices 0 To used_rows - 1
    For i = 1 To used_rows
        cell_array(i - 1) = Sheet1.Cells(i, 1).Value
    Next
    random_index = Int(Rnd * used_rows)   '0 To used_rows - 1
    MsgBox cell_array(random_index)
End Sub
You can go ahead and change MsgBox to whatever you like, or set a cell, e.g. Cells(1, 4).Value = cell_array(random_index), or however you'd like to proceed. The pick is based on the number of rows used, so depending on how you lay out your spreadsheet the code might have to be changed a bit.
Here's the updated code based on the suggestions from the comments. Also remember to call Randomize once, for example in your form initialization or Workbook_Open procedure.
Sub random()
    Dim used_rows As Long
    Dim random_row As Long
    'Multiple ways to get the row count, this is just a simple one which will work for most implementations
    used_rows = Sheet1.UsedRange.Rows.Count
    random_row = Int(Rnd * used_rows) + 1   '1 To used_rows; there is no row 0
    'I use the variable only for the reason that you might want to reference it later
    MsgBox Sheet1.Cells(random_row, 1)
End Sub
This assumes that by "table" you mean "Table with a capital T", known in VBA as a ListObject:
Sub PickRandomTens()
    Dim lo As Excel.ListObject
    Dim ListRowsCount As Long
    Dim RandomNumber As Long
    Dim ListEvent As String
    Set lo = ActiveSheet.ListObjects(1)
    ListRowsCount = lo.DataBodyRange.Rows.Count
    '1 To 10 picks row 1, 11 To 20 picks row 2, and so on, so every row is equally likely
    RandomNumber = Application.WorksheetFunction.RandBetween(1, ListRowsCount * 10)
    ListEvent = lo.ListColumns("Data Column").DataBodyRange.Cells((RandomNumber + 9) \ 10)
    MsgBox "Random number: " & RandomNumber & vbCrLf & _
           "Event: " & ListEvent
End Sub

EXCEL VBA- Average all rows containing numerical values for each column in a Merged Area

I have multiple spreadsheets that each roughly look like this:
I'm trying to find a way to go through each of the SPEAKER HEADERS in Row 1, and summarize the scores that are associated with the corresponding survey question ("Was the CONTENT good? Was the SPEAKER relevant? Was the DELIVERY good?"), grouped by color.
I can't think of a clever way of doing this automatically.
I can get the RANGE SPANS of the Merged Cells like this:
For Each Cell In src_sheet.UsedRange.Cells
If Cell.Row = 1 And IsEmpty(Cell) = False Then
MsgBox Cell.MergeArea.Address
End If
Next
I then need to iterate over the range provided by the address, getting the numerical values in all the rows BELOW that range.
For example, running the current macro produces this:
I need to take $C$1:$E$1 and run a for loop that say FROM C1 to E1 average all the numbers in the rows below it. I have no idea how to do this.
I was thinking about extending the selection to include everything used.
Is there a better way to do this?
This is the tragically bad way I'm doing it now (which I'm quite proud of on account of being new to excel):
For Each Cell In src_sheet.UsedRange.Cells
If Cell.Row = 1 And IsEmpty(Cell) = False Then
Set rng = Range(Cell.MergeArea.Address) 'Equal to the Address of the Merged Area
startLetter = Mid(rng.Address, 2, 1) 'Gets letter from MergeArea Address
endLetter = Mid(rng.Address, 7, 1) 'Gets letter from MergeArea Address
On Error GoTo ErrHandler:
Set superRange = Range(startLetter & ":" & endLetter)
ErrHandler:
endLetter = startLetter
Set superRange = Range(startLetter & ":" & endLetter)
Resume Next
superRange.Select
MsgBox Application.Average(Selection)
In order to get rid of the error you are having, you need to change:
Set rng = Cell.MergeArea.Address
to
Set rng = Range(Cell.MergeArea.Address)
Ideally, this data would be better stored in a database so that it could be queried easily. If that's not an option, then the way you are going at it in Excel is as valid as most any other approach.
EDIT
Once you obtain the address of the left-most column for each of your speakers, you can loop through each column to obtain averages.
'Number of columns in the current speaker's range.
numColumns = rng.Columns.Count
'First row containing data.
firstDataRow = 4
'First column containing data.
firstColumn = rng.Column
'Loop through each column.
For col = firstColumn To firstColumn + (numColumns - 1)
    totalValue = 0
    'Restart at the top of the data area for every column.
    currentRow = firstDataRow
    'Loop through each row until the first blank cell.
    Do While Cells(currentRow, col).Value <> ""
        totalValue = totalValue + Cells(currentRow, col).Value
        currentRow = currentRow + 1
    Loop
    'Rows firstDataRow To currentRow - 1 were summed, so that is the divisor.
    If currentRow > firstDataRow Then
        averageValue = totalValue / (currentRow - firstDataRow)
        'Do something with this average value before moving on to the next column.
    End If
Next
If you don't know what row is the start of your data, you can keep checking every row below rng.Row until you hit a numeric value.
The method above assumes that you have no blank entries in your data area. If you have blank entries, then you should either sort the data prior to running this code, or you would need to know how many rows you must check for data values.
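If blank entries are a concern, one alternative is to let Excel's own AVERAGE do the work, since it ignores blank and text cells inside a range. The sketch below is not the answerer's method. It assumes the speaker headers are in row 1 and that scores start in row 4 (the value used in the loop above); the name AverageBelowMergedHeaders is made up for illustration. It walks row 1 one merged block at a time and prints one average per column under each header.

```vba
Option Explicit

' Sketch: for each non-empty (possibly merged) header in row 1, prints the
' average of every column beneath it, ignoring blank and text cells.
Public Sub AverageBelowMergedHeaders()
    Const FIRST_DATA_ROW As Long = 4   ' assumption taken from the answer above

    Dim ws As Worksheet
    Dim area As Range, dataCol As Range
    Dim lastRow As Long, lastCol As Long
    Dim c As Long, col As Long

    Set ws = ActiveSheet
    With ws.UsedRange
        lastRow = .Row + .Rows.Count - 1
        lastCol = .Column + .Columns.Count - 1
    End With
    If lastRow < FIRST_DATA_ROW Then Exit Sub

    c = 1
    Do While c <= lastCol
        ' MergeArea returns the cell itself when it is not merged.
        Set area = ws.Cells(1, c).MergeArea
        If Not IsEmpty(area.Cells(1, 1).Value2) Then
            For col = area.Column To area.Column + area.Columns.Count - 1
                Set dataCol = ws.Range(ws.Cells(FIRST_DATA_ROW, col), _
                                       ws.Cells(lastRow, col))
                ' Average raises an error when a range holds no numbers.
                If Application.WorksheetFunction.Count(dataCol) > 0 Then
                    Debug.Print area.Address(False, False) & ", column " & col & _
                        ": " & Application.WorksheetFunction.Average(dataCol)
                End If
            Next col
        End If
        c = area.Column + area.Columns.Count   ' jump past the merged block
    Loop
End Sub
```

Working from MergeArea.Column and MergeArea.Columns.Count also avoids parsing column letters out of the address string, which stops working once columns reach AA and beyond.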

Creating an Excel Macro to delete rows if a column value repeats consecutively less than 3 times

The data I have can be simplified to this:
http://i.imgur.com/mn5GgrQ.png
In this example, I would like to delete the data associated with track 2, since it has only 3 frames associated with it. All data with more than 3 associated frames can stay.
The frame number does not always start from 1, as I've tried to demonstrate. The track number will always be the same number consecutively for as many frames as are tracked. I was thinking of using a function to append 1 to a variable for every consecutive value in column A, then performing a test to see if this value is equal >= 3. If so, then go onto the next integer in A, if no, then delete all rows marked with that integer (2, in this case).
Is this possible with Visual Basic in an Excel Macro, and can anyone give me some starting tips on what functions I might be able to use? Complete novice here. I haven't found anything similar for VBA, only for R.
I assume you understand the code by reading it.
Option Explicit
Public Function GetCountOfRowsForEachTrack(ByVal sourceColumn As Range) As _
Scripting.Dictionary
Dim cell As Range
Dim trackValue As String
Dim groupedData As Scripting.Dictionary
Set groupedData = New Scripting.Dictionary
For Each cell In sourceColumn
trackValue = cell.Value
If groupedData.Exists(trackValue) Then
groupedData(trackValue) = cell.Address(False, False) + "," + groupedData(trackValue)
Else
groupedData(trackValue) = cell.Address(False, False)
End If
Next
Set GetCountOfRowsForEachTrack = groupedData
End Function
Public Sub DeleteRowsWhereTrackLTE3()
    Dim groupedData As Scripting.Dictionary
    Set groupedData = GetCountOfRowsForEachTrack(Range("A2:A15"))
    Dim cellsToBeDeleted As String
    Dim item
    For Each item In groupedData.Items
        'UBound is zero-based, so <= 2 means the track has three or fewer cells
        If UBound(Split(item, ",")) <= 2 Then
            cellsToBeDeleted = item + IIf(cellsToBeDeleted <> "", "," + cellsToBeDeleted, "")
        End If
    Next
    'Range("") raises an error, so only delete when something was collected
    If cellsToBeDeleted <> "" Then Range(cellsToBeDeleted).EntireRow.Delete
End Sub
GetCountOfRowsForEachTrack is a function returning a dictionary (which stores track number as key, cell address associated with that track as string)
DeleteRowsWhereTrackLTE3 is the procedure which uses GetCountOfRowsForEachTrack to get the aggregated info of track numbers and the cells associated with them. It loops through the dictionary and checks whether the upper bound of the split address list is <= 2; because splitting the string returns an array which starts from 0, this means the track has three or fewer cells. It builds a string of the addresses of such cells and deletes them all at once at the end.
Note:
Add the code above to a standard (.bas) module, or to the specific sheet module where you have the data.
Add a reference to the "Microsoft Scripting Runtime" library. Inside the VBA editor, click the "Tools" -> "References" menu, tick "Microsoft Scripting Runtime" and click OK.
I have used A2:A15 as an example. Please modify it as per your cell range.
The assumption is that you don't have thousands of cells to be deleted, in which case the method could fail.
Make a call to DeleteRowsWhereTrackLTE3 to remove such rows.
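If the number of rows to delete could be large, a variant that collects rows with Union instead of a comma-separated address string avoids the length limit on the string passed to Range. The following is a sketch under assumptions, not the code from the answer: the name DeleteShortTracks and the constant MIN_ROWS are invented for illustration, track numbers are assumed to be in column A from row 2 down on the active sheet, and the threshold mirrors the answer above (tracks with three or fewer rows are removed). It late binds, so no reference is required.

```vba
Option Explicit

' Sketch: deletes every row whose track number (column A) occurs fewer
' than MIN_ROWS times.
Public Sub DeleteShortTracks()
    Const MIN_ROWS As Long = 4   ' hypothetical threshold: 3 or fewer rows are deleted

    Dim ws As Worksheet
    Dim counts As Object
    Dim toDelete As Range
    Dim lastRow As Long, r As Long
    Dim key As String

    Set ws = ActiveSheet
    Set counts = CreateObject("Scripting.Dictionary")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' First pass: count the rows per track.
    For r = 2 To lastRow
        key = CStr(ws.Cells(r, "A").Value2)
        counts(key) = counts(key) + 1   ' a missing key starts out as Empty, i.e. 0
    Next r

    ' Second pass: collect the rows of short tracks into one range.
    For r = 2 To lastRow
        If counts(CStr(ws.Cells(r, "A").Value2)) < MIN_ROWS Then
            If toDelete Is Nothing Then
                Set toDelete = ws.Rows(r)
            Else
                Set toDelete = Union(toDelete, ws.Rows(r))
            End If
        End If
    Next r

    ' Delete everything in a single call, so no row numbers shift mid-loop.
    If Not toDelete Is Nothing Then toDelete.Delete
End Sub
```

This counts all rows per track number, which matches counting consecutive rows as long as each track number appears in a single contiguous block, as the question describes.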