VBA: Delete duplicate entries (Row) in column S based on the lowest price in column Q

I'm new to VBA and have had help here before with some VBA macro issues; now I need help again.
I have an Excel file with a huge number of duplicate EAN numbers in column S. I want to delete every row that contains a duplicate EAN, keeping only the row with the lowest price (column Q) for each EAN. There are more than 10,000 rows, so deleting the rows manually or with a formula would take far too long.
Example below (the first value is the price, the second is the EAN):
104.93 - 000000001
104.06 - 000000001
104.94 - 000000001
In this case I want to delete the first and third rows and keep the second. Does anyone know what the macro should look like? I use Excel 2010.

This might help you.
I assume you have a header row. If not, change iHeaderRowIndex to 0.
The first part creates a dictionary object, collects all the unique EAN numbers, and assigns each EAN a very high starting price (9,999,999, just under 10 million).
Then it rescans the list, this time applying "MIN" logic to determine the lowest price per EAN.
On another rescan, it puts a MIN mark in a free column next to each minimum-price EAN row (choose a free, empty column for this; I used "W", but you can change that).
Last, it rescans the list in reverse order and deletes every row that does not have the MIN mark. At the end it also deletes the column with the MIN marks.
Public Sub DoDelete()
    Dim oWS As Worksheet
    Dim d As Object
    Dim a As Range
    Dim b As Range
    Dim sColumnForMarking As String
    Dim sKey As String
    Dim iHeaderRowIndex As Long
    Dim i As Long            ' Long, not Integer: the sheet may exceed 32,767 rows
    Dim iLastRow As Long
    Dim v As Double

    Set oWS = ActiveSheet
    Set d = CreateObject("Scripting.Dictionary")

    ' ----> Put ZERO here if you do not have a header row !!!
    iHeaderRowIndex = 1
    ' ----> Change this to any free, empty column. It is used to mark the minimum value.
    sColumnForMarking = "W"

    ' Last used row in column "S"
    iLastRow = oWS.Cells(oWS.Rows.Count, "S").End(xlUp).Row
    If iLastRow <= iHeaderRowIndex Then Exit Sub   ' no data rows

    ' Selecting the data cells of column "S"
    Set a = oWS.Range(oWS.Cells(1 + iHeaderRowIndex, "S"), oWS.Cells(iLastRow, "S"))

    ' Start every EAN with a high number, one that is beyond the max value in column Q
    ' ----> Change it if it is too low !!!!
    For Each b In a
        d(CStr(b.Value)) = 9999999
    Next

    ' Determine the lowest price per EAN
    For Each b In a
        v = CDbl(oWS.Cells(b.Row, "Q").Value)
        If v < CDbl(d(CStr(b.Value))) Then d(CStr(b.Value)) = v
    Next

    ' Mark the first row that holds the lowest price of each EAN
    For Each b In a
        sKey = CStr(b.Value)
        If d.Exists(sKey) Then
            If CDbl(oWS.Cells(b.Row, "Q").Value) = CDbl(d(sKey)) Then
                oWS.Cells(b.Row, sColumnForMarking).Value = "MIN"
                d.Remove sKey   ' later rows with the same lowest price stay unmarked
            End If
        End If
    Next

    ' This part deletes the rows that are not marked as "MIN".
    Application.ScreenUpdating = False
    For i = iLastRow To iHeaderRowIndex + 1 Step -1
        If oWS.Cells(i, sColumnForMarking).Value <> "MIN" Then
            oWS.Rows(i).Delete Shift:=xlShiftUp
        End If
    Next
    ' Clean up: delete the mark column
    oWS.Columns(sColumnForMarking).Delete
    Application.ScreenUpdating = True
End Sub
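A shorter alternative, offered only as a sketch and not as the answerer's method, is to sort by price and then let Excel's built-in RemoveDuplicates keep the first row of each EAN. The procedure name KeepCheapestPerEan is made up for illustration, and the sketch assumes a header in row 1, data starting in column A, and numeric prices in column Q. Note that, unlike the macro above, it changes the row order.

```vba
Public Sub KeepCheapestPerEan()
    ' Sketch only: assumes a header row in row 1, data starting in column A,
    ' numeric prices in column Q and EANs in column S of the active sheet.
    Const EAN_COLUMN As Long = 19   ' column S, counted from column A

    Dim ws As Worksheet
    Dim lastRow As Long
    Dim lastCol As Long
    Dim data As Range

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "S").End(xlUp).Row
    lastCol = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
    If lastRow < 2 Or lastCol < EAN_COLUMN Then Exit Sub

    Set data = ws.Range(ws.Cells(1, 1), ws.Cells(lastRow, lastCol))

    ' Cheapest rows first, so the first occurrence of each EAN is the cheapest one
    data.Sort Key1:=ws.Range("Q1"), Order1:=xlAscending, Header:=xlYes

    ' RemoveDuplicates keeps the first row of every EAN and removes the rest
    data.RemoveDuplicates Columns:=EAN_COLUMN, Header:=xlYes
End Sub
```

Try it on a copy of the workbook first: if the prices are stored as text (for example with a decimal comma), the sort order will not be numeric and the wrong rows may survive.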

Related

Derive cell value of an Excel Table based on two parameters

I have two columns in Excel, A and B. Column A holds percentages (rates) and column B holds integers (years).
rating PD   year
0.39%       3
0.88%       2
1.32%       17
0.88%       1
0.26%       15
0.17%       2
0.17%       2
0.59%       2
0.59%       2
Then I have a table with the years in column F and the rating labels in the header row.
Like this (the table is much bigger and the years go up to 30):
       Rating
Year   AAA      AA+      AA       AA-
1      0.003%   0.008%   0.018%   0.049%
2      0.016%   0.037%   0.074%   0.140%
3      0.041%   0.091%   0.172%   0.277%
4      0.085%   0.176%   0.318%   0.465%
5      0.150%   0.296%   0.514%   0.708%
And so on (the table is much bigger than this).
So I would need a function, or a shortcut, which, for a given rate in column A and a given year in column B, gives me, in column C, the corresponding rating (AAA,AA+,AA etc.).
In the table the rates are maximums. So if I have A1=0.050% and B1=2, I look at the table row for year 2 and find the matching rate, which is 0.074% (and therefore AA), because the AA+ maximum of 0.037% is too low.
In other words, AA+ in year 2 covers all the rates between 0.016% and 0.037%, and AA in year 2 covers all the rates between 0.037% and 0.074%.
Do you know how I could perform this task?
Thank you very much.
For the sake of code readability, I've split the code into the main procedure shown here and two custom-made helper functions. Otherwise it would be one huge code dump.
Before you begin, check or change these settings:
The (blue) data table needs to be named "scores" (or change the name inside the code to your own).
The same goes for the (green) grades table: it needs to be named "grades" and start in F1.
Last but not least, the code presumes these two tables are on a sheet called "Sheet1".
If any of these names do not match, change them within the code!
Now to the procedure:
Option Explicit

Private Sub run_through_scores()
    Dim scores As ListObject ' table from A1
    Dim grades As ListObject ' table from F1
    Set scores = Sheets("Sheet1").ListObjects("scores")
    Set grades = Sheets("Sheet1").ListObjects("grades")

    Dim cell As Range          ' loop variable
    Dim inrow As Long          ' index of the "grades" data row that holds the year
    Dim resultColumn As Long   ' worksheet column in which the percentage was found

    ' for every cell in the second column of the scores table (except header)
    For Each cell In scores.ListColumns(2).DataBodyRange
        ' worksheet row where the year was found, minus the header row,
        ' gives the index of the matching data row in "grades"
        inrow = get_year(cell).Row - grades.HeaderRowRange.Row
        ' get_interval() determines in which column the sought percentage lies
        resultColumn = get_interval(cell.Offset(0, -1), inrow).Column
        ' write the rating label from the grades header into column C
        cell.Offset(0, 1).Value = Sheets("Sheet1").Cells(grades.HeaderRowRange.Row, resultColumn).Value
    Next cell
End Sub
And to the functions:
get_year()
returns a Range object from the "grades" table: the cell in which the year from our "scores" table was found. If the desired year is not found, it falls back to the last table row, which is treated as the closest year.
' Returns a Range (coordinates) for where to search in the second table
Private Function get_year(ByVal year As Variant) As Range
    Dim grades As ListObject ' table from F1
    Set grades = Sheets("Sheet1").ListObjects("grades")

    Dim testcell As Range
    ' LookAt:=xlWhole so that searching for 1 does not match 10, 11, ...
    Set testcell = grades.ListColumns(1).DataBodyRange.Find(year, LookIn:=xlValues, LookAt:=xlWhole)

    If Not testcell Is Nothing Then
        ' if found
        Set get_year = testcell
    Else
        ' if the year is not found, return the last data row
        Dim tbl_last_row As Long
        tbl_last_row = grades.ListColumns(1).DataBodyRange.Rows.Count
        Set get_year = grades.ListColumns(1).DataBodyRange.Cells(tbl_last_row, 1)
    End If
End Function
And the second function:
get_interval()
returns a Range object from the "grades" table. It walks through the cells of the year's row and returns a) the first cell whose percentage is greater than or equal to the sought percentage from "scores", or b) the last cell if we went through all the cells without a match (because the sought percentage must then be higher than the maximum of the last interval).
Private Function get_interval(ByVal what As Variant, ByVal inyear As Long) As Range
    Dim grades As ListObject ' table from F1
    Set grades = Sheets("Sheet1").ListObjects("grades")

    Dim cell As Range
    For Each cell In grades.ListRows(inyear).Range
        ' we don't want to check the year column (the first column of the table)
        If cell.Column <> grades.Range.Column Then
            ' check for interval
            If what <= cell.Value Then
                Set get_interval = cell
                Exit Function
            End If
        End If
    Next cell
    ' if we arrived here, the result will always be the last cell of the row
    Set get_interval = grades.ListRows(inyear).Range.Cells(1, grades.ListColumns.Count)
End Function
Invoking the run_through_scores() procedure gives the expected results.
If you have any questions, please let me know :)
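The interval rule itself (pick the first rating whose maximum rate is at least the given rate) can be tried without any worksheet. The following is only a sketch: the function name RatingForRate and the demo procedure are made up for illustration, and the maximums are the year 2 row of the question's table written as fractions (0.016% = 0.00016).

```vba
Option Explicit

' Hypothetical helper: returns the label of the first maximum that is >= rate.
' maxRates and labels are assumed to be arrays of equal length, with the
' maximums sorted in ascending order.
Public Function RatingForRate(ByVal rate As Double, ByVal maxRates As Variant, _
                              ByVal labels As Variant) As String
    Dim i As Long
    For i = LBound(maxRates) To UBound(maxRates)
        If rate <= maxRates(i) Then
            RatingForRate = labels(LBound(labels) + i - LBound(maxRates))
            Exit Function
        End If
    Next i
    ' The rate is above every maximum: fall back to the last label
    RatingForRate = labels(UBound(labels))
End Function

Public Sub DemoRatingForRate()
    ' Year 2 row of the table: AAA 0.016%, AA+ 0.037%, AA 0.074%, AA- 0.140%
    Debug.Print RatingForRate(0.0005, _
        Array(0.00016, 0.00037, 0.00074, 0.0014), _
        Array("AAA", "AA+", "AA", "AA-"))   ' prints AA
End Sub
```

A rate of 0.050% is above the AAA and AA+ maximums but below the AA maximum of 0.074%, so the sketch returns AA.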

Using VBA to find start value, count rows till that value becomes 0 and record result. Repeat for same column until the end of the data reached

I'm a newbie to VBA/coding in general, and my usual tactic of sticking together bits of pre-written code isn't working for my problem.
I'm looking to create a macro that will do 3 things:
1. Find a starting point for the data in a column.
2. Start counting the number of rows once the cell value has changed to a constant.
3. Once the value moves back to the starting point, stop the count and record the number of cells counted in a separate column, at the same position as the start of the count.
Repeat until the end of the data.
For this case the start point will be when the cell has a value of >0.
It will increase to a constant number (300).
Once at 300 the macro will have to count the number of rows that contain the numerical value 300 until the value goes back to 0.
Report count in a separate table on the worksheet with the entry being input at the same relative position in the new table as when the count started from the data.
And finally the loop.
I need to also do a similar count but in the horizontal direction (i.e. counting columns on a row). If anyone can create a code for the vertical/row count problem above I'd really appreciate it if you could annotate it so I can attempt to understand/learn which bits of code carry out each action and thus change it up for horizontal/column count.
I've attached a screenshot of the spreadsheet however as a new user it must be as a link. The blue highlighted table is the data used for the vertical /row count problem I am talking about. The blank table underneath the highlighted table has manually inputted correct answers for the first column of data for what I would like the macro to do in case I haven't accurately described my request.
I have also attached the horizontal table with correct manually inputted answers for row 1 in the separate table for the column count along the row.
Lastly, here is the code that I have written to tackle the problem, however it is very basic and won't run.
Sub Count0()
For Each c In Worksheets("Sheet1").Range("D30:D39")
If c.Value = 0 Then
End If
If c.Value > 0 Then
v = Range(c.Value)
For i = 3 To Rows.Count
If Cells(i, 1).Value <> v Then
MsgBox CStr(i - 2)
End If
Next i
Next c
End Sub
This worked in the limited case I tested (two columns and several rows in different patterns). It's pretty basic; there are more elegant ways to do it.
Sub Count0()
    'To hold the current cell
    Dim current As Range
    'To hold the last row and column having data
    Dim lastRow As Long
    Dim lastCol As Long
    'To iterate across rows and columns
    Dim r As Long
    Dim c As Long
    'Flag/counter variables
    Dim found As Long 'Saves row on which first "constant" was found
    Dim count As Long 'Saves count of "constants"

    'Use SpecialCells method to obtain the last row and column
    ' that have data.
    lastCol = Worksheets("Sheet1").Cells.SpecialCells(xlCellTypeLastCell).Column
    lastRow = Worksheets("Sheet1").Cells.SpecialCells(xlCellTypeLastCell).Row

    'Increment through all columns that have data. This is a bit inefficient
    ' because it really isn't necessary to go through all the empty cells,
    ' but it works.
    For c = 1 To lastCol
        'Initialize flag/counter
        found = 0
        count = 0
        'Increment through all rows for the current column.
        For r = 1 To lastRow
            'Examine the current cell
            Set current = Worksheets("Sheet1").Cells(r, c)
            'For positive values, save the first row that has the value
            ' and count the number of values.
            If current.Value > 0 Then
                If found = 0 Then found = r
                count = count + 1
            End If
            'When the next non-positive value is reached--OR the last
            ' row is reached--and there was a constant found, write the count
            ' to Sheet2 in the cell corresponding to the row and
            ' column having the first instance of the constant.
            If (current.Value <= 0 Or r = lastRow) And found > 0 Then
                Worksheets("Sheet2").Cells(found, c).Value = count
                'Reset the flag/counter
                found = 0
                count = 0
            End If
        Next r
    Next c
End Sub
I was struggling with what you had written, and ended up doing this instead. I left you variables for changing the sheets to read from and write to (assuming you can write the results to another sheet; if not, it should be easy enough to change).
This should also work for all cells in your range, assuming that there are values in all cells.
Problems I noted with your original code were:
The first If did nothing.
The name Count0 is legal (digits are allowed after the first character), but a descriptive name is clearer.
Declaring no variables is a bad idea.
Anyway, leave me a comment if you need any help (and well done for writing a good first question).
Sub CountZero()
    Dim SourceSheet As Worksheet, SummarySheet As Worksheet
    Dim CurrentCell As Range
    Dim FirstRow As Long, LastRow As Long
    Dim FirstColumn As Long, LastColumn As Long
    Dim Rw As Long, col As Long
    Dim TotalValues As Long

    Set SourceSheet = Worksheets("Sheet1")
    Set SummarySheet = Worksheets("Sheet2")

    FirstRow = 1
    LastRow = SourceSheet.Range("A" & SourceSheet.Rows.Count).End(xlUp).Row
    FirstColumn = 1
    LastColumn = SourceSheet.Cells(1, SourceSheet.Columns.Count).End(xlToLeft).Column

    For col = FirstColumn To LastColumn
        For Rw = FirstRow To LastRow
            Set CurrentCell = SourceSheet.Cells(Rw, col)
            If CurrentCell.Value <> 0 Then
                ' Count the run of identical values that starts in this cell
                TotalValues = ProcessSection(CurrentCell)
                SummarySheet.Cells(Rw, col).Value = TotalValues
                ' Jump to the last row of the run; Next Rw then moves past it
                Rw = Rw + TotalValues - 1
            End If
        Next Rw
    Next col
End Sub

Function ProcessSection(FirstCellWithValue As Range) As Long
    ' Counts the consecutive cells, going down, that hold the same value as the first one
    Dim Counter As Long: Counter = 0
    Do Until FirstCellWithValue.Offset(Counter, 0).Value <> FirstCellWithValue.Value
        Counter = Counter + 1
    Loop
    ProcessSection = Counter
End Function
As a small disclaimer, I haven't tested this, let me know if there are problems.
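For the horizontal variant the question also asks about, the same idea can be applied with the loops swapped, so that columns are walked within each row. This is only a sketch under assumptions: the procedure name CountRunsAcrossRows is made up, the sheet names are taken from the answers above, and only cells equal to the constant 300 are counted, as the question describes.

```vba
Sub CountRunsAcrossRows()
    ' Sketch only: Sheet1 holds the data, Sheet2 receives the counts,
    ' and the constant to count is assumed to be 300.
    Const TARGET_VALUE As Double = 300

    Dim src As Worksheet, dst As Worksheet
    Dim lastRow As Long, lastCol As Long
    Dim r As Long, c As Long
    Dim startCol As Long    ' column where the current run began (0 = no run)
    Dim runLength As Long   ' number of cells in the current run
    Dim cellValue As Variant
    Dim isTarget As Boolean

    Set src = Worksheets("Sheet1")
    Set dst = Worksheets("Sheet2")
    lastRow = src.Cells.SpecialCells(xlCellTypeLastCell).Row
    lastCol = src.Cells.SpecialCells(xlCellTypeLastCell).Column

    For r = 1 To lastRow
        startCol = 0
        runLength = 0
        ' Go one column past the data so a run touching the edge is still written
        For c = 1 To lastCol + 1
            isTarget = False
            If c <= lastCol Then
                cellValue = src.Cells(r, c).Value
                If IsNumeric(cellValue) Then isTarget = (CDbl(cellValue) = TARGET_VALUE)
            End If

            If isTarget Then
                If startCol = 0 Then startCol = c
                runLength = runLength + 1
            ElseIf startCol > 0 Then
                ' The run has ended: record its length where it started
                dst.Cells(r, startCol).Value = runLength
                startCol = 0
                runLength = 0
            End If
        Next c
    Next r
End Sub
```

The count lands in Sheet2 at the row and column where each run of 300s begins, mirroring the layout of the vertical version.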

VBA Excel word search and copying formulas

I'm looking for a VBA macro for Excel that can detect the word "mean" in column A. It should then copy the yellow row, with the formula, into columns C to J.
The formula calculates the average from the row after the previous "mean" up to the row before the next one, e.g. =AVERAGE(C1323:C1437).
After every sixth mean, "Area" and "150" also need to be copied two rows below the mean, and columns I and J need to be changed. From there on, I and J would refer to cell A1441 in this case (=G1439/C1439*$A$1441), until the end of the file.
I'm not sure whether this is easy or not, but I'm totally out of my depth. I would be very thankful for help.
Sub Makro1()
'
' Makro1 Makro
'
' Keyboard shortcut: Ctrl+q
strSearchWord = "Mean"
i = Application.WorksheetFunction.CountIf(Range("A:A"), strSearchWord)
Y = 2
For x = i To 0
i = Application.WorksheetFunction.Match(strSuchWort, Range("A:A"), 0)
Range("C" & i).Select
Application.CutCopyMode = False
ActiveCell.FormulaR1C1 = "=AVERAGE(R[-147]C:R[-1]C)" ' that's still wrong, should be something like i-y?
Selection.AutoFill Destination:=Range("C" & i:"J" & i), Type:=xlFillDefault
Range("CY:JY").Select
i = Y
'for each fifth i
'Range("A" & i + 3).Select
' ActiveCell.FormulaR1C1 = "=RC[-2]/RC[-6]*R2159C1"
Next x
End Sub
It's still wrong, but it's my first draft.
@stucharo, the Area correction is difficult to describe, so I've added a better picture with formulas. I hope it's understandable now.
If the number of rows between means changes each time, then your line ActiveCell.FormulaR1C1 = "=AVERAGE(R[-147]C:R[-1]C)" needs a variable, as your comment suggests. Also, a string that starts with "=" is entered as a formula, so you will still see it as a formula when you click the cell in the workbook (and Excel will highlight the range, etc.). The row offset has to be negative because the averaged rows lie above the formula cell. You could try replacing the line with:
ActiveCell.FormulaR1C1 = "=AVERAGE(R[-" & (i - Y) & "]C:R[-1]C)"
although, since I can't see the first row of your sheet, I'm not certain that will give you the correct range of rows each time.
If your row number is likely to change and you are copying over the same number of columns each time, then it might also be just as easy to write the formula directly to the cells within a loop, rather than explicitly copying it.
Adding text after every 6th "mean" requires you to keep count of how many means have passed so far. This can be done by incrementing a counter variable; the Mod operator then gives you the remainder after a division. Therefore numberOfMeans Mod 6 gives the remainder when divided by 6, and when this equals zero you know you have a multiple of 6. I've tried to capture all this in the code below.
Sub Test()
    Application.ScreenUpdating = False

    ' Long instead of Integer, so large sheets cannot overflow the row counters
    Dim startRow As Long
    startRow = 2
    Dim endRow As Long
    ' Last used row in column A (does not stop at blank cells in between)
    endRow = Cells(Rows.Count, 1).End(xlUp).Row
    Dim lastMeanRow As Long
    lastMeanRow = startRow - 1
    Dim areaRow As Long
    areaRow = lastMeanRow + 3
    Dim meanCounter As Long
    meanCounter = 0
    Dim avgColHeight As Long
    Dim col As Long
    Dim r As Long

    'Check each row in the sheet
    For r = startRow To endRow
        'Cols I and J in every row need to be modified
        For col = 9 To 10
            Cells(r, col).FormulaR1C1 = "=RC[-2]/RC[-6]*R" & areaRow & "C1"
        Next col
        'If column 1 of that row contains "mean" (in any letter case) then
        If StrComp(CStr(Cells(r, 1).Value), "mean", vbTextCompare) = 0 Then
            'Calculate the column height to average over....
            avgColHeight = r - lastMeanRow - 1
            '...and loop through each of the columns....
            '(including I and J to add average)
            For col = 3 To 10
                '....inserting the averaging formula.
                Cells(r, col).FormulaR1C1 = "=AVERAGE(R[-" & avgColHeight & "]C:R[-1]C)"
            Next col
            'Then increment the counter to keep track of the number of means
            meanCounter = meanCounter + 1
            'If the number of means is a multiple of 6 then
            If (meanCounter Mod 6 = 0) Then
                'insert the "Area" label and the value 150
                Cells(r + 2, 1).Value = "Area"
                Cells(r + 3, 1).Value = 150
                areaRow = r + 3
            End If
            'Finally change the lastMeanRow to the mean row we have just processed.
            lastMeanRow = r
        End If
    'Do it again until we reach the end of the data
    Next r

    Application.ScreenUpdating = True
End Sub
I also noticed your point about the value of Area changing periodically. Writing this programmatically, as above, will allow you to add some logic for the value of "Area" and when it changes.
You clearly have a long list of data and want to automate the creation of the rows and formulas you describe.
It is possible to write VBA to scan through the data and modify the formulas etc., but first I would question whether this is the best approach to give you what you need.
Excel has a feature called "pivot tables", which essentially allows you to summarise data in a list.
For instance, if the list had one row for each city in the world, giving the city's population and, in another column, the country it is in, a pivot table could be used to calculate the average city population for each country. I suspect you are doing this sort of thing.
If you don't know about pivot tables, you should find out about them.
In your case the mean row is summarising the data in the rows above it. To use pivot tables you would need a column that defines which group each row is in. Your pivot table would use this column as a row field, and you would then calculate the average for all the other columns.
@Nathalie: it's hard to help without knowing more, e.g. is the data delivered with the mean text already inserted? It looks like column A holds a number that represents the row number within the group (and this could be used by a formula to create the "Group Name" column you need for pivot tables).
You can get the pivot tables to do the area adjustment by creating a new set of columns containing formulas that copy the values in columns C to J, except for the 6th set of data, where you adjust the values in C to J accordingly.
You probably need to introduce columns that:
A. give the "group name"
B. give a count of which group it is, so that every 6th group you can make the adjustment you need.
By using pivot tables and basic techniques, you will find it easier to update and refresh the data, should you need to.
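The two helper columns (A and B above) could be filled with a short macro before building the pivot table. This is only a sketch under assumptions: the procedure name FillGroupColumns is made up, columns K and L are assumed to be free, the data is assumed to start in row 2 of the active sheet, and column A is assumed to contain no error values.

```vba
Sub FillGroupColumns()
    ' Sketch only: column A contains the text "mean" on each summary row,
    ' and columns K and L are assumed to be free helper columns.
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long
    Dim groupNo As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    groupNo = 1

    For r = 2 To lastRow
        ws.Cells(r, "K").Value = "Group " & groupNo     ' A. the group name
        ws.Cells(r, "L").Value = (groupNo Mod 6 = 0)    ' B. True for every 6th group
        ' A "mean" row closes the current group
        If StrComp(CStr(ws.Cells(r, 1).Value), "mean", vbTextCompare) = 0 Then
            groupNo = groupNo + 1
        End If
    Next r
End Sub
```

Existing "mean" rows receive the label of the group they close, so they would need to be filtered out of the pivot table to avoid averaging the averages.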

vb excel keep 1 instance of duplicate items in a range

Hi, I am using VB to populate data in Excel. In the sheet, column G has many cells with the same numbers (repeated without following any pattern). First I would like to find which entries are not unique, then keep the first occurrence in the column and delete the entire rows where repetitions occur. Here's an example:
As can be seen from the image, in column G the numbers 1000 and 2200 are repeated, so I need to delete the entire rows 3 and 6 (keeping rows 1 and 2, where 1000 and 2200 appear first).
Here's the code, which I can't get to work:
Sub Dupli()
Dim i As Long, dic As Object, v As Object
dic = CreateObject("Scripting.Dictionary")
i = 1
For Each v In sheet.UsedRange.Rows
If dic.exists(v) Then sheet.Rows(v).EntireRow.Delete() Else dic.Add(v, i)
i = i + 1
Next v
End Sub
Try something like this. I don't think you need a dictionary (unless there is some other need for it elsewhere in your code). When deleting objects, it's usually necessary to iterate backwards over the collection. This method just uses the CountIf function to test whether the cell value in column G of a specific row occurs more than once in all of column G, and deletes the row if that condition is true.
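Sub Dupli()
    Dim i As Long
    Dim cl As Range
    ' "sheet" is assumed to be the worksheet object your existing code already uses
    ' Iterate backwards so that deleting a row does not shift the rows still to be checked
    For i = sheet.UsedRange.Rows.Count To 1 Step -1
        Set cl = sheet.Cells(i, 7) '## Examine the cell in Column G
        If Application.WorksheetFunction.CountIf(sheet.Range("G:G"), cl.Value) > 1 Then
            sheet.Rows(i).EntireRow.Delete
        End If
    Next
End Sub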
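Alternatively, without VBA, put this in H1:
=COUNTIF(G$1:G1;G1)
(use a comma instead of the semicolon if your regional settings use the comma as the list separator)
Fill down to the end of the data.
Apply an AutoFilter to column H.
Filter out the values of 1, so that only counts greater than 1 remain visible.
Select the remaining rows by their row headers.
Right-click a row header > click Delete ...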
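If you would rather keep the dictionary idea from the question, the following is a minimal VBA sketch of it, not the code from either answer. It assumes the values sit in column G of the active sheet and contain no error values; the procedure name KeepFirstOccurrence is made up for illustration. Rows are collected first and deleted in one step, so no backwards loop is needed.

```vba
Sub KeepFirstOccurrence()
    ' Sketch only: keeps the first row for each value in column G
    ' of the active sheet and deletes every later repetition.
    Dim ws As Worksheet
    Dim seen As Object
    Dim toDelete As Range
    Dim lastRow As Long
    Dim r As Long
    Dim key As String

    Set ws = ActiveSheet
    Set seen = CreateObject("Scripting.Dictionary")
    lastRow = ws.Cells(ws.Rows.Count, "G").End(xlUp).Row

    For r = 1 To lastRow
        key = CStr(ws.Cells(r, "G").Value)
        If Len(key) > 0 Then                    ' ignore empty cells
            If seen.Exists(key) Then
                ' A repetition: remember the row for deletion
                If toDelete Is Nothing Then
                    Set toDelete = ws.Rows(r)
                Else
                    Set toDelete = Union(toDelete, ws.Rows(r))
                End If
            Else
                seen.Add key, r                 ' first occurrence: keep it
            End If
        End If
    Next r

    If Not toDelete Is Nothing Then toDelete.Delete
End Sub
```

With the example from the question (1000 and 2200 repeated in rows 3 and 6), rows 3 and 6 are collected and removed while rows 1 and 2 stay.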

How to look for values with % in a column and then sum them up and place it in an empty row?

I was wondering if there is any method that could detect a column containing % values and then sum the rows that have the %?
For example, in the following picture, columns B to D have values that contain %. The total for B will be 130%, for C 105% and for D 90%. The totals will be placed in the next empty row, which in this particular example is row 8. I need the row to be flexible, as the data may extend beyond row 8.
I hope someone understands what I'm trying to convey here. What I'm trying to do is look for the % values and sum them up, with the total placed in the next empty row.
I do not understand the follow-up question about percentages in column 1. The following is my solution to your original question. It creates a row, below all existing rows, containing the total of the percentages in each column. I have formatted the values and coloured them blue to show how. I have included comments saying what I am doing, but not explaining why a statement does what it does. Try VBA Help first, then come back with any questions.
Option Explicit

Sub SumPercentages()
    Dim ColCrnt As Long
    Dim ColMax As Long
    Dim RowCrnt As Long
    Dim RowCrntMax As Long
    Dim RowSum As Long
    Dim TotalPerc As Double
    Dim FoundPerc As Boolean

    With Sheets("Sheet1") ' Replace with name of your sheet
        ' Find the last used row plus one as the row on which to place
        ' the totals. Warning: writing to this row fails if there is
        ' a value on the last possible row.
        RowSum = .Cells.SpecialCells(xlCellTypeLastCell).Row + 1
        ' Find the last used column
        ColMax = .Cells.SpecialCells(xlCellTypeLastCell).Column
        ' Examine every used column
        For ColCrnt = 1 To ColMax
            ' Find the last used row in this column
            RowCrntMax = .Cells(.Rows.Count, ColCrnt).End(xlUp).Row
            TotalPerc = 0
            FoundPerc = False
            ' Examine each row of this column
            For RowCrnt = 1 To RowCrntMax
                If Right(.Cells(RowCrnt, ColCrnt).NumberFormat, 1) = "%" Then
                    ' Use the numeric value directly; Val() would depend on the decimal separator
                    If IsNumeric(.Cells(RowCrnt, ColCrnt).Value) Then
                        TotalPerc = TotalPerc + CDbl(.Cells(RowCrnt, ColCrnt).Value)
                        FoundPerc = True
                    End If
                End If
            Next
            ' Save the value, format as percentage, colour blue
            ' (columns without any percentage get no total)
            If FoundPerc Then
                With .Cells(RowSum, ColCrnt)
                    .Value = TotalPerc
                    .NumberFormat = "0%"
                    .Font.Color = RGB(0, 0, 255)
                End With
            End If
        Next
    End With
End Sub