Gather data tidy in Excel using VBA - vba

What the case is:
So I got a "results sample" in excel format that needs filtering and reshaping to look nice. It is a result that will be not identical all the time but it follows similar rules. I have to filter it further and make it a little more tidy. I have figured out the filtering part, but I am not sure how to sort the remaining data, in a tidy way.
What the situation is:
There are six columns involved.
Notice: Real deal is not THAT simple, but what I need can be demonstrated using such a simple example and then I can manage more complex stuff myself I suppose.
For our example we use columns from B to G
The data are set as pairs of a "title" and a value.
For instance, if you look the first example picture I provide, The first detais the pair B3 and C3.
As you can see, looking at the same picture, D3 and E3 is an empty pair.
Same goes for D4 - E4 and F4 - G4 and so on until a last one at B11 - C11.
Starting data example:
[
What I want to achieve:
I would like, using Visual Basic for Applications, to sort the data, starting from let's say for our example B3 (see second picture) and fill three SETS of two columns, (BC, DE, FG) if there are no data inside those cells.
Notice: If a cell like D3 is null then SURELY E3 will be null too so there can be just only one check. I mean we can check either value columns or title columns.
Notice2: The B,D,F or C,E,G columns DON'T have to be sorted. I just want all the not-null values of B,D,F and their respective values from C,E,G gathered together neat so printing will not need 30 pages but just a few (too many spaces between is causing it and I try to automate the cleanup)

Here's something to start with. The first double loop populates a VBA Collection with Range variables that refer to the Cells that contain the titles.
The associated values are obtained by using an offset. The middle double loop performs a bubble sort on the latter (highly inefficient - you might want to replace it with something else). The next if statement creates a 2nd sheet if it doesn't exist on which to write out the results (last loop).
Option Explicit
Sub GatherData()
Dim lastRow As Integer, lastCol As Integer
Dim r As Integer, c As Integer
Dim vals As Collection
Set vals = New Collection
With Sheets(1)
lastCol = .UsedRange.Columns(.UsedRange.Columns.Count).Column
lastRow = .UsedRange.Rows(.UsedRange.Rows.Count).row
For c = 1 To lastCol Step 2
For r = 1 To lastRow
If (Trim(Cells(r, c).Value) <> "") Then
vals.Add .Cells(r, c)
End If
Next
Next
End With
' Bubble Sort
Dim i As Integer, j As Integer
Dim vTemp As Range
For i = 1 To vals.Count - 1
For j = i + 1 To vals.Count
If vals(i).Value > vals(j).Value Then
Set vTemp = vals(j)
vals.Remove j
vals.Add vTemp, vTemp, i
End If
Next j
Next i
Dim sht2 As Worksheet
If ThisWorkbook.Worksheets.Count = 1 Then
Set sht2 = ThisWorkbook.Sheets.Add(After:=ThisWorkbook.Sheets(1))
Else
Set sht2 = Worksheets(2)
End If
With sht2
r = 3
c = 2
For i = 1 To vals.Count
.Cells(r, c).Value = vals(i).Value
.Cells(r, c + 1).Value = vals(i).Offset(, 1).Value
c = c + 2
If c = 8 Then
r = r + 1
c = 2
End If
Next
End With
End Sub

Here is a method using the Dictionary object. I use early binding which requires setting a reference to Microsoft Scripting Runtime. If you are going to be distributing this, you might want to convert this to late-binding.
We assume that your data is properly formed as you show it above. In other words, all the titles are in even numbered columns; and the results are in the adjacent cell.
We create the dictionary using the Title as the Key, and the adjacent cell value for the Dictionary item.
We collect the information
Transfer the Keys to a VBA array and sort alphabetically
create a "Results Array" and populate it in order
write the results to a worksheet.
I will leave formatting and header generation to you.
By the way, there is a constant in the code for the number of Title/Value pair columns. I have set it to 3, but you can vary that.
Enjoy
Option Explicit
Option Compare Text 'If you want the sorting to be case INsensitive
'set reference to Microsoft Scripting Runtime
Sub TidyData()
'Assume Titles are in even numbered columns
'Assume want ColPairs pairs of columns for output
'Use dictionary with Title as key, and Value as the item
Dim dctTidy As Dictionary
Dim arrKeys As Variant
Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
Dim vSrc As Variant, vRes As Variant
Dim LastRow As Long, LastCol As Long
Dim I As Long, J As Long, K As Long, L As Long
Dim V As Variant
'in Results
Const ColPairs As Long = 3
'Set Source and results worksheet and range
Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet2")
Set rRes = wsRes.Cells(1, 2)
'Read source data into variant array
With wsSrc.Cells
LastRow = .Find(what:="*", after:=.Item(1, 1), _
LookIn:=xlValues, searchorder:=xlByRows, searchdirection:=xlPrevious).Row
LastCol = .Find(what:="*", after:=.Item(1, 1), _
LookIn:=xlValues, searchorder:=xlByColumns, searchdirection:=xlPrevious).Column
vSrc = .Range(.Cells(1, 1), .Cells(LastRow, LastCol))
End With
'Collect the data into a dictionary
Set dctTidy = New Dictionary
For I = 1 To UBound(vSrc, 1)
For J = 2 To UBound(vSrc, 2) Step 2
If vSrc(I, J) <> "" Then _
dctTidy.Add Key:=vSrc(I, J), Item:=vSrc(I, J + 1)
Next J
Next I
'For this purpose, we can do a simple sort on the dictionary keys,
' and then create our results array in the sorted order.
arrKeys = dctTidy.Keys
Quick_Sort arrKeys, LBound(arrKeys), UBound(arrKeys)
'Create results array
ReDim vRes(1 To WorksheetFunction.RoundUp(dctTidy.Count / ColPairs, 0), 1 To ColPairs * 2)
I = 0
J = 0
For Each V In arrKeys
K = Int(I / ColPairs) + 1
L = (J Mod ColPairs) * 2 + 1
vRes(K, L) = V
vRes(K, L + 1) = dctTidy(V)
I = I + 1
J = J + 1
Next V
'write the results
Set rRes = rRes.Resize(UBound(vRes, 1), UBound(vRes, 2))
With rRes
.Worksheet.Cells.Clear
.Value = vRes
.HorizontalAlignment = xlCenter
End With
End Sub
Sub Quick_Sort(ByRef SortArray As Variant, ByVal first As Long, ByVal last As Long)
Dim Low As Long, High As Long
Dim Temp As Variant, List_Separator As Variant
Low = first
High = last
List_Separator = SortArray((first + last) / 2)
Do
Do While (SortArray(Low) < List_Separator)
Low = Low + 1
Loop
Do While (SortArray(High) > List_Separator)
High = High - 1
Loop
If (Low <= High) Then
Temp = SortArray(Low)
SortArray(Low) = SortArray(High)
SortArray(High) = Temp
Low = Low + 1
High = High - 1
End If
Loop While (Low <= High)
If (first < High) Then Quick_Sort SortArray, first, High
If (Low < last) Then Quick_Sort SortArray, Low, last
End Sub

Assuming we got all variables set and initialized properly, in this example:
Sheets("sheetname").Select ' because stupid things can happen...
For i = 3 To 13
Let newrangeT = "B" & i '
Let newrangeV = "C" & i '
If Sheets("sheetname").Range(newrangeV) <> "" Then
values(Position) = Sheets("sheetname").Range(newrangeV)
titles(Position) = Sheets("sheetname").Range(newrangeT)
Position = Position + 1
Else
' Don't do anything if the fields are null
End If
Next i
Sheets("sheetname").Range("B1:G13").Clear
' We then get each data from the arrays with a For loop.
' We set a columnset variable to 1.
' We set a currentrow variable to 3.
' If columnset is 1 data will enter in B and C and columnset = columnset +1
' Then if columnset is 2 we set data to DE and columnset = columnset +1
' But if columnset is 2we set data to FG and columnset = 1 and currentrow = currentrow +1
' Iterating the arrays will result in a neat setting of the data, but it will add zeros for all the nulls. Thus we need an If statement that will exclude that values checking the TITLE array (that should contain a title instead). if the value is not 0 then... we run what I describe, otherwise we do nothing.
Putting the data in the array is half of the trick.
Then we clear the area.
We set two string variables to declare ranges (actually cell reference) for every cell iterated in the loop. Here I demonstrated only for column set B,C
but we have to do the same for the rest of the columns.
The If statement here checks for null. You might have different needs, so changing the if statement changes the filtering. Here I check if the cells are not null. If the cells of column C contain data, put those data in values array and the respective B data on titles array but where? Position starts as 1 and we then iterate it +1 each time it adds something.
You can set data from an array using this command:
' current_row is set to the first row of the spreadsheet we wanna fill.
Sheets("sheetname").Select ' because stupid things can happen...
newrangeV = "C" & current_row
Sheets("sheetname").Range(newrangeV) = values(j)
The rest is just putting things together.
In any case, I wanna thank both of the people involved in this question, because I might didn't got the solution, but I got an idea of how to do other stuff, like accidentally learning something new. Cheers.

Related

How can I reverse this offset property?

I have data that I'm moving into a template, and I'm trying to repurpose my vba script I used. Before, the data needed to be transposed into a specific range, but now I want to just move it over without the need to transpose it.
'Write the employee data into the template
a = 0
For k = 2 To UBound(Data, 2)
Dest.Offset(a, j) = Data(i, k)
a = a + 1
Next
I assume the dest.offset property is what is causing the transposition, how would I change this to just move the array normally without the transpose?
Rest of script:
Option Explicit
Sub Main()
Dim Wb As Workbook
Dim Data, Last
Dim i As Long, j As Long, k As Long, a As Long
Dim Dest As Range
'Refer to the template
Set Wb = Workbooks("ValidationTemplate.xlsx")
'Refer to the destination cell
Set Dest = Wb.Sheets("Employee Core Skills").Range("B3")
'Read in all data
With ThisWorkbook.Sheets("Sheet1")
Data = .Range("DO2", .Range("A" & Rows.Count).End(xlUp))
End With
Wb.Activate
Application.ScreenUpdating = False
'Process the data
For i = 1 To UBound(Data)
'Manager changes?
If Data(i, 1) <> Last Then
'Skip the first
If i > 1 Then
'Scroll into the view
Dest.Select
'Save a copy
Wb.SaveCopyAs ThisWorkbook.Path & Application.PathSeparator & _
ValidFileName(Last & "_Assessment.xlsx")
End If
'Clear the employees
Dest.Resize(, Columns.Count - Dest.Column).EntireColumn.ClearContents
'Remember this manager
Last = Data(i, 1)
'Start the next round
j = 0
End If
'Write the employee data into the template
a = 0
For k = 2 To UBound(Data, 2)
Dest.Offset(a, j) = Data(i, k)
a = a + 1
Next
End If
'Next column
j = j + 1
Next
End Sub
If I understand your question correctly...
Dest.Offset(a, j)
Means use the Range referred to by the Range Object called Dest, then offset from there a rows (positive would be down the spreadsheet, negative up the spreadsheet) and j columns (positive to the right, negative to the left).
If you just want to put the data in the Range pointed to by Dest, then simply omit the .Offset() portion like this:
Dest.value2 = Data(i,k).value2
Note: .Value is the default member referred to when you leave it off by referring to just Dest. It's always best to specify and not leave it up to VBA to figure out what you mean. Why use .Value2 instead of just .Value? Read this SO question and the accepted answer.
The transposition is happening here because of the order of a and j.
a = 0
For k = 2 To UBound(Data, 2)
Dest.Offset(a, j) = Data(i, k)
a = a + 1
Next
End If
'Next column
j = j + 1
It's really hard to tell because of the unclear variable names.
If you rename your variables like this:
Dim I as Long --> Dim sourceRow as Long
Dim K as Long --> Dim sourceCol as Long
Dim A as Long --> Dim destRow as Long
Dim J as Long --> Dim destCol as Long
You'll see that's the way they're being currently being used and that what you want to do is swap destRow with destCol.
Rewriting that code with the new variable names gives you:
destRow = 0
For sourceCol = 2 To UBound(Data, 2)
Dest.Offset(destRow, destCol) = Data(sourceRow, sourceCol)
destRow = destRow + 1
Next
End If
'Next column
destCol = destCol + 1
and now you can much more easily see that your loop is incrementing your sourceCol and your destRow. If you now change that to:
destRow = 0
For sourceCol = 2 To UBound(Data, 2)
Dest.Offset(destRow, destCol).Value2 = Data(sourceRow, sourceCol).Value2
destCol = destCol + 1
Next
End If
'Next column
destRow = destRow + 1
You'll see that the loop is now incrementing both source & dest columns. Now you just need to change the incrementers in the outer loop to update the rows in sync and you should be good.
This is a great object lesson in why good names for code "things" are incredibly valuable. Once I sorted out a, i, j & k, it made it very obvious. It appears you're not the original author of this code, but even if you are, it's OK. Most of us start out with horrible names for stuff then slowly learn over time how valuable good names are. It's well worth it to refactor the code to improve these names and others.
A quick shameless plug for the Rubberduck plugin for the VBE. I'm a huge fan and starting to contribute to the project, as well. It will allow you to refactor your code by intelligently renaming variables. As you can imagine, doing a search & replace for i to sourceRow wsourceRowll gsourceRowve you some sersourceRowously broken code! Rubberduck will avoid that problem and add many, many more features that you'll soon wonder how you ever lived without!

Loop not correctly deleting rows in the Worksheet

So here is the task I am required to do.
I have a worksheet in which the user can specify a Column name and an element under the column, once chosen, the macro will find and delete every element with said name.
My issue comes from the final part of the macro, the delete. My loop doesn't delete all the rows, it will only find one instance of the element and delete it, then go to the next element and delete it, leaving every other element with the same name intact.
Here is the function within the macro, I apologize in advance for the poor code quality as I am not well versed in vba.
Function LineDelete() As Variant
Dim NbLignes As Integer
Dim ctr As Integer
Dim ctr2 As Integer
Dim Table As Variant
Worksheets("parametrage_suppr_ligne").Activate
ctr = 1
ctr2 = 1
NbLignes = Cells.Find("*", Range("A1"), , , xlByRows, xlPrevious).Row - 4
ReDim Table(1 To NbLignes, 2)
While ctr <= NbLignes
Table(ctr, 1) = Cells(ctr + 4, 1).Value
Table(ctr2, 2) = Cells(ctr2 + 4, 2).Value
ctr = ctr + 1
ctr2 = ctr2 + 1
Wend
Call FileOpen
Call delInvalidChars
Call OrderRows
Dim newCtr As Integer
Dim rng As Range
Dim rngHeaders As Range
Dim newString As Variant
Dim i As Integer
NbLignes = 0
NbLignes = Cells.Find("*", Range("A1"), , , xlByRows, xlPrevious).Row
Set rngHeaders = Range("1:1")
newCtr = 1
For i = NbLignes To FirstRow Step -1
Set rng = rngHeaders.Find(Table(newCtr, 1))
If Table(newCtr, 1) = rng Then
MsgBox "All is gud!!"
newString = Cells.Find(Table(newCtr, 2))
If Table(newCtr, 2) = newString Then
MsgBox newString
Range(Cells.Find(Table(newCtr, 2)).Address).EntireRow.Delete
newCtr= newCtr + 1
End If
End If
newCtr = newCtr + 1
Next i
End Function
So now to explain a bit what I've done here.
At first I store the options in a 2 dimentional table with a simple loop, in this table I store the name of the column a well as the name of the element under the column that has to be deleted.
After that I call the functions which open a txt file which is then transformed into an excel file, it is in this new excel file that the deletes have to be done.
I then reset the NbLignes variable as well as call new variables.
Here is where the issues begin, I thought that by iterating on the number of lines the new excel file has; the program was going to look for all of the instances of the word in the column and was going to delete them, but so far it will only do it 3 times.
I am totally lost as to what modify to be able to fix this.
Here is what the config table looks like, this is what the user can modify to specify what to delete, it is also what I store inside of the 2d Table:
User can add as many columns and names as needed
EDIT: What the code does now after updating is that it deletes all the elements that have the same name as the first one in the image (fun_h_opcomp), the expected outcome would be that as soon as all those elements are deleted, the program should then pass on to the next one (fun_b_pterm) and so on.
Of course the i was just an Example for that counter and you must use your newCtr counter here, and FirstRow must be set to a value.
Const FirstRow As Long = 1
Dim newCtr As Long 'always use Long for row counting
For newCtr = NbLignes To FirstRow Step -1
Set rng = rngHeaders.Find(Table(newCtr, 1))
If Table(newCtr, 1) = rng Then
MsgBox "All is gud!!"
newString = Cells.Find(Table(newCtr, 2))
If Table(newCtr, 2) = newString Then
MsgBox newString
Range(Cells.Find(Table(newCtr, 2)).Address).EntireRow.Delete
End If
End If
Next newCtr
There is no need to increment/decrement newCtr anymore because this is automatically done by the Next statement.

Excel VBA looping: reshape column into table

I have not used VBA for sometime so am very rusty... What I have is a number of records stored vertically (in a single column) and I want to use VBA to stack them side by side (into a table).
My general thoughts about how this would flow:
Start at first range
Copy data
Paste data in cell B3 of output page (just named Sheet2)
Loop back to previous range and offset by 51 rows
Copy data
Paste data in cell C3 of output page (offset by 1 column each time)
My attempt so far:
Sub Macro1()
FiftyOne = 51 ' Offset by 51 rows for every chunk
StartRange = "L262:L303" ' Start at this range of data to copy, each chunk is identical in size
OutputRange = B3 ' Paste in output at B3, but need to offset by one column each time
Range(StartRange).Offset(FiftyOne, 0).Select
Selection.Copy
Sheets("Sheet2").Select
Range("B3").Offset(0, 1).Select
ActiveSheet.Paste
End Sub
I know this is a rather lame attempt to tackle this flow, but I am really struggling with how to loop through this. I would appreciate some advice on how to do this, or a better approach to the general flow.
Edit after accepting Wolfie's answer:
I want to assign column headings, by getting the values from C258 and looping down (in a similar way to before) 51 rows at a time, to paste into row 2 of sheet2 (B2, C2, ...).
Here is my current attempt:
Sub NameToTable()
' Assign first block to range, using easily changable parameters
' Remember to "Dim" all of your variables, using colon for line continuation
Dim blocksize As Long: blocksize = 51
Dim firstrow As Long: firstrow = 258
Dim rng As Range
Set rng = ThisWorkbook.Sheets("Sheet1").Range("C" & firstrow & blocksize - 1)
' tablestart is the upper left corner of the "pasted" table
Dim tablestart As Range: Set tablestart = ThisWorkbook.Sheets("Sheet2").Range("B2")
Dim i As Long ' Looping variable i
Dim nblocks As Long: nblocks = 10 ' We're going to loop nblocks number of times
For i = 0 To nblocks - 1
' Do the actual value copying, using Resize to set the number of rows
' and using Offset to move down the original values and along the "pasted" columns
tablestart.Offset(0, i).Resize(blocksize, 1).Value = _
rng.Offset(blocksize * i, 0).Value
Next i
End Sub
Your logic seems alright, this code will create a 51 x n table, lining up each vertical block of 51 cells in its own column.
Note, it's much quicker to assign the .Value than copying and pasting, if you need formats too then you could copy/paste or similarly set format properties equal.
Sub ColumnToTable()
' Assign first block to range, using easily changable parameters
' Remember to "Dim" all of your variables, using colon for line continuation
Dim blocksize As Long: blocksize = 51
Dim firstrow As Long: firstrow = 262
Dim rng As Range
Set rng = ThisWorkbook.Sheets("Sheet1").Range("L" & firstrow & ":L" & firstrow + blocksize - 1)
' tablestart is the upper left corner of the "pasted" table
Dim tablestart As Range: Set tablestart = ThisWorkbook.Sheets("Sheet2").Range("B3")
Dim i As Long ' Looping variable i
Dim nblocks As Long: nblocks = 10 ' We're going to loop nblocks number of times
For i = 0 To nblocks - 1
' Do the actual value copying, using Resize to set the number of rows
' and using Offset to move down the original values and along the "pasted" columns
tablestart.Offset(0, i).Resize(blocksize, 1).Value = _
rng.Offset(blocksize * i, 0).Value
Next i
End Sub
Set the nblocks value to suit your needs, this is the number of resulting columns in your output table. You could get it dynamically by knowing the number of rows in the original column. Or you could use some while logic, careful to make sure that it does eventually exit of course!
Dim i As Long: i = 0
Do While rng.Offset(blocksize*i, 0).Cells(1).Value <> ""
tablestart.Offset(0, i).Resize(blocksize, 1).Value = rng.Offset(blocksize * i, 0).Value
i = i + 1
Loop
Edit: to get your column headings, keep in mind that the column headings are only 1 cell, so:
' Change this:
Set rng = ThisWorkbook.Sheets("Sheet1").Range("C" & firstrow & blocksize - 1)
' To this:
Set rng = ThisWorkbook.Sheets("Sheet1").Range("C" & firstrow)
Tip: + is used for adding numerical values, whilst & is used for concatenating stings.
Now when you're looping, you don't need the Resize, because you are only assigning 1 cell's value to 1 other cell. Resulting sub:
Sub NameToTable()
Dim blocksize As Long: blocksize = 51
Dim firstrow As Long: firstrow = 258
Dim rng As Range
Set rng = ThisWorkbook.Sheets("Sheet1").Range("C" & firstrow)
Dim tablestart As Range: Set tablestart = ThisWorkbook.Sheets("Sheet2").Range("B2")
Dim i As Long: i = 0
Do While rng.Offset(blocksize*i, 0).Value <> ""
tablestart.Offset(0, i).Value = rng.Offset(blocksize * i, 0).Value
i = i + 1
Loop
End Sub
When dealing with your worksheets in excel, each time you reference them adds overhead and slows down the code, what you want to do is take all of the info off your spreadsheet into an array then use Application.Transpose to transpose it for you.
You can then use 'Resize' to make sure your destination range is the same size and set the values.
Sub CopyAndTransRange(src As Range, dest As Range)
Dim arr As Variant 'Needs to be a variant to take cell values
arr = Application.Transpose(src.Value) 'Set to array of values
On Error GoTo eh1dim 'Capture error from vertical 1D range
dest.Resize( _
UBound(arr, 1) - LBound(arr, 1) + 1, _
UBound(arr, 2) - LBound(arr, 2) + 1 _
) = arr 'Set destination to array
Exit Sub
eh1dim:
dest.Resize( _
1, _
UBound(arr) - LBound(arr) + 1 _
) = arr 'Set row to 1D array
End Sub
Note, Application.Transpose will fall over with some arrays in weird circumstances like if there is more than 255 characters in a string in the given array, for those situations you can write your own Transpose function to flip the array for you.
Edit:
When you feed a vertical 1-dimensional range and transpose it, VBA converts it to a 1-dimensional array, I've rewritten so it captures the error when this happens then adjusts accordingly.
Just made this example which has values 1 through 7 populated on the first 7 rows of column A. This code effectively loops through each of the values, and transposes horizontally so all values are on a single row (1).
Dim rng As Range
Dim crng As Range
Static value As Integer
Set rng = ActiveSheet.Range("A1", Range("A1").End(xlDown))
For Each crng In rng.Cells
ActiveSheet.Range("A1").Offset(0, value).value = crng.value
If value <> 0 Then
crng.value = ""
End If
value = value + 1
Next crng
First we grab the required range and then iterate through each cell. Then using the offset method and an incrementing integer, we can assign their values horizontally to a single row.
It's worth noting that this would work when trying to transpose both vertically and horizontally. The key is the offset(column, row).
Just adjust where you place your incrementing Integer.
Hope this helps.

COUNTIF() in 'For' loop

I have a column with nearly 100k and am trying to determine how many times a value occurs repeatedly in that column. I can do it row by row currently, but this is menial as a programmer, through something like =COUNTIF(D:D,D2). Yet that only returns D2 matches in column D.
I need to iterate through all values of D returning countif, therefore revealing all of the values repetitions in the column. I can remove duplicates later! So I have a dev. button a basic sub, or function (man this is new to me) and something along the lines of the most basic for loop ever. Just getting caught up on how to implement the COUNTIF() to to the loop properly.
Right now I'm looking at:
Sub doloop()
Dim i As Integer
i = 1
Do While i < D.Length
Cells(i, 8).Value =CountIf(D:D,D[i])
i = i + 1
Loop
End Sub
That code is incorrect obviously but it is where I'm at and may help for anyone more familiar with other languages.
Use Application.WorksheetFunction.CountIf() in your loop.
Private Sub doloop()
Dim lastRow As Long
Dim d As Double
Dim r As Range
Dim WS As Excel.Worksheet
Dim strValue As String
Dim lRow As Long
'Build your worksheet object
Set WS = ActiveWorkbook.Sheets("sheet1")
'Get the last used row in column A
lastRow = WS.Cells(WS.Rows.count, "D").End(xlUp).Row
'Build your range object to be searched
Set r = WS.Range("D1:D" & lastRow)
lRow = 1
WS.Activate
'Loop through the rows and do the search
Do While lRow <= lastRow
'First, get the value we will search for from the current row
strValue = WS.Range("D" & lRow).Value
'Return the count from the CountIf() worksheet function
d = Application.worksheetFunction.CountIf(r, strValue)
'Write that value to the current row
WS.Range("H" & lRow).Value = d
lRow = lRow + 1
Loop
End Sub
I believe you are trying to write the value to the cell, that is what the above does. FYI, if you want to put a formula into the cell, here is how that is done. Use this in place of WS.Range("H" & lRow).Value = d
WS.Range("H" & lRow).Formula = "=CountIf(D:D, D" & lRow & ")"
Sounds like you may want to look into using tables in Excel and capitalizing on their features like filtering and equation autofill. You may also be interested in using a PivotTable to do something very similar to what you're describing.
If you really want to go about this the programmatic way, I think the solution Matt gives answers your question about how to do this using CountIf. There's a big detriment to using CountIf though, in that it's not very computationally efficient. I don't think the code Matt posted will really be practical for processing the 100K rows mentioned in the OP (Application.ScreenUpdating = false would help some). Here's an alternative method that's a lot more efficient, but less intuitive, so you'll have to decide what suites your needs and what you feel conformable with.
Sub CountOccurances()
'Define Input and Output Ranges
'The best way to do this may very from case to case,
'So it should be addressed seperately
'Right now we'll assume current sheet rows 1-100K as OP specifies
Dim RInput, ROutput As Range
Set RInput = Range("D1:D100000")
Set ROutput = Range("E1:E100000")
'Define array for housing and processing range values
Dim A() As Variant
ReDim A(1 To RInput.Rows.Count, 0)
'Use Value2 as quicker more accurate value
A = RInput.Value2
'Create dictionary object
Set d = CreateObject("Scripting.Dictionary")
'Loop through array, adding new values and counting values as you go
For i = 1 To UBound(A)
If d.Exists(A(i, 1)) Then
d(A(i, 1)) = d(A(i, 1)) + 1
Else
d.Add A(i, 1), 1
End If
Next
'Overwrite original array values with count of that value
For i = 1 To UBound(A)
A(i, 1) = d(A(i, 1))
Next
'Write resulting array to output range
ROutput = A
End Sub
You can also modify this to include the removal of replicates you mentioned.
Sub CountOccurances_PrintOnce()
'Define Input and Output Ranges
'The best way to do this may very from case to case,
'So it should be addressed seperately
'Right now we'll assume current sheet rows 1-100K as OP specifies
Dim RInput, ROutput As Range
Set RInput = Range("D1:D100000")
Set ROutput = Range("F1:F9")
'Define array for housing and processing range values
Dim A() As Variant
ReDim A(1 To RInput.Rows.Count, 0)
'Use Value2 as quicker more accurate value
A = RInput.Value2
'Create dictionary object
Set d = CreateObject("Scripting.Dictionary")
'Loop through array, adding new values and counting values as you go
For i = 1 To UBound(A)
If d.Exists(A(i, 1)) Then
d(A(i, 1)) = d(A(i, 1)) + 1
Else
d.Add A(i, 1), 1
End If
Next
'Print results to VBA's immediate window
Dim sum As Double
For Each K In d.Keys
Debug.Print K & ": " & d(K)
sum = sum + d(K)
Next
Debug.Print "Total: " & sum
End Sub

Transfer Data by matching Header

I have a Backend raw output data set, which consists of multiple columns with some of them being empty except of the header.
I would like to transfer this data into another worksheet, let's call it Backend - processed. In this worksheet, I would prepare a header row, which consists of some of the headers included in the original data set. There won't be any new headers in the processed worksheet (so basically headers(processed) is a subset of headers(raw output)).
Once, I used to solve this problem with a function (Index & Match), but with growing raw data sets, this became suboptimal from a performance perspective.
Since then, I've been reading up on VBA codes and this is what I came up with until now:
Sub test()
Dim r As Range, c As Range, msg As String
With Sheets("Backend - raw").Range("4:4").CurrentRegion
For Each r In Sheets("Backend - processed").Range("b7:t7")
Set c = .Rows(1).Find(r.Value, , , xlWhole, , 0)
If Not c Is Nothing Then
.Columns(c.Column).Copy
r.PasteSpecial xlPasteValues
Else
msg = msg & vbLf & r.Value
End If
Next
Application.CutCopyMode = False
End With
End Sub
The Range 4:4 is where the headers of the original raw data output are found. Range b7:t7 is where the headers of the processed data table are found.
Being a total beginner at VBA, I'm quite happy that it works, but still think that there is a huge margin for improvement:
1) It's still pretty slow, taking about 10 seconds to complete 40x500 arrays.
2) I don't know how to make it stop looking for the next header, if the last header was blank (end of range b7:t7)
3) I'm very much open to new/better approaches to tackle this issue.
Searching, copying and pasting can be quite time consuming affairs. You'd probably be better off reading the headers just once into some kind of stored list (a Collection would work well for you because it could store the column number as its value and the header text as its key.
Given that you are only copying and pasting values (ie you don't need to pass cell formatting into your processed sheet) then reading the values into an array and then writing that array to the sheet will be quicker.
The code below is an example of that, but I'm sure with more thought it could be made even quicker (for example by discarding a header from the collection once it's been used, or not having to find the last row number for every individual column).
Dim rawSht As Worksheet
Dim procSht As Worksheet
Dim headers As Collection
Dim c As Integer
Dim v As Variant
Set rawSht = ThisWorkbook.Worksheets("Backend - raw")
Set procSht = ThisWorkbook.Worksheets("Backend - processed")
Set headers = New Collection
For c = 1 To rawSht.Cells(4, Columns.Count).End(xlToLeft).Column
headers.Add c, rawSht.Cells(4, c).Text
Next
For c = 2 To 20
rawCol = headers(procSht.Cells(7, c).Text)
v = rawSht.Range(rawSht.Cells(5, rawCol), rawSht.Cells(Rows.Count, rawCol).End(xlUp)).Value2
procSht.Cells(8, c).Resize(UBound(v, 1)).Value = v
Next
This is using arrays (40 cols x 1000 rows in 0.03125 sec)
Option Explicit
Sub testArr()
Const HDR1 As Long = 4 'header row on sheet 1
Const HDR2 As Long = 7 'header row on sheet 2
Dim ws1 As Worksheet, ur1 As Range, vr1 As Variant, c1 As Long, c2 As Long, r As Long
Dim ws2 As Worksheet, ur2 As Range, vr2 As Variant, msg As String, t As Double
t = Timer
Set ws1 = Worksheets("Backend - raw")
Set ws2 = Worksheets("Backend - processed")
Set ur1 = ws1.UsedRange
Set ur2 = ws2.UsedRange.Rows(ws2.UsedRange.Row - HDR2 + 1)
Set ur2 = ur2.Resize(ur1.Row + ur1.Rows.Count - HDR1 + 1)
vr1 = ur1 'copy from Range to array
vr2 = ur2
For c1 = 1 To UBound(vr1, 2)
For c2 = 1 To UBound(vr2, 2)
If vr1(1, c1) = vr2(1, c2) Then
For r = 2 To UBound(vr1, 1)
vr2(r, c2) = vr1(r, c1)
Next
Exit For
Else
msg = msg & vbLf & vr1(HDR1, c1)
End If
Next
Next
ur2 = vr2 'copy from array back to Range
Debug.Print "testArr duration: " & Timer - t & " sec"
End Sub