I want to filter a large list of names on a sheet in Excel. On another sheet I have a list of names that I want to exclude from the larger list. How would I use the advanced filter to do this? I have tried the code below, but it does not seem to work. My big list is in K2:K5000 and my criteria are in H2:H3 (the criteria will grow, but I kept the list small for testing). Any help would be greatly appreciated!
Sub Filter()
Sheet5.Range("K2:K5000").AdvancedFilter Action:=xlFilterInPlace, _
CriteriaRange:=Sheets("Sheet3").Range("H2:H3"), Unique:=False
End Sub
To exclude the values in H2:H3 from K2:K5000 using the advanced filter, you can use the following approach:
Make sure cell K1 is not empty (enter any header)
Find 2 unused cells (e.g. I1:I2)
Leave I1 blank
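Enter the following formula in I2 (if I2 is on a different sheet from the list, qualify K2 with the list's sheet name, e.g. Sheet5!K2, assuming that is the tab name)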
=ISNA(MATCH(K2,$H$2:$H$3,0))
Use the following code to exclude rows
Sheet5.Range("K1:K5000").AdvancedFilter Action:=xlFilterInPlace, _
CriteriaRange:=Sheets("Sheet3").Range("I1:I2"), Unique:=False
I am not sure off the top of my head how you would use advanced filter to exclude, but you can use formulas in your advanced filter (near the bottom). You can, however, just use a dictionary to store values you want to exclude, then exclude (hide rows, or autofilter on the ones not found in your exclusion list)
Sub Filter()
    Dim i As Long
    Dim str As String
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
    'Load the names to exclude into the dictionary
    With Worksheets("Sheet3")
        For i = 2 To 3
            str = CStr(.Range("H" & i).Value)
            If Not dict.Exists(str) Then
                dict.Add str, vbNullString
            End If
        Next i
    End With
    With Sheet5
        For i = 2 To 5000
            str = CStr(.Range("K" & i).Value)
            If Len(str) > 0 And dict.Exists(str) Then
                'Name is on the exclusion list: hide its row
                .Range("K" & i).EntireRow.Hidden = True
            Else
                'alternatively, you can add those that aren't found
                'to an array for autofilter
            End If
        Next i
    End With
    'If building autofilter array, apply filter here.
End Sub
Using AutoFilter:
Use an array of strings as the criteria, together with the "Operator:=xlFilterValues" argument of AutoFilter. Build your array however you want; I chose to build a string with a For loop and split it (quick to write and test, but not ideal for a number of reasons).
Note: AutoFilter is applied to the headers, not data.
With Sheet5
.AutoFilterMode = False
.Range("K1").AutoFilter _
Field:=1, _
Criteria1:=arr, _
Operator:=xlFilterValues
End With
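The snippet above leaves arr undefined. A minimal sketch of one way to build it, assuming the exclusion list is in Sheet3!H2:H3 and the data is in K1:K5000 of Sheet5 as in the question: collect every value that is not excluded into a second dictionary and pass its keys to AutoFilter. The sub name FilterOutExcludedNames and the variable names are made up for illustration.

```vba
Sub FilterOutExcludedNames()
    Dim excluded As Object, keep As Object
    Dim i As Long, s As String

    Set excluded = CreateObject("Scripting.Dictionary")
    Set keep = CreateObject("Scripting.Dictionary")

    ' Names to exclude (assumed location: Sheet3!H2:H3)
    With Worksheets("Sheet3")
        For i = 2 To 3
            s = CStr(.Range("H" & i).Value)
            If Len(s) > 0 Then excluded(s) = vbNullString
        Next i
    End With

    With Sheet5
        ' Collect each distinct name that is NOT on the exclusion list
        For i = 2 To 5000
            s = CStr(.Range("K" & i).Value)
            If Len(s) > 0 And Not excluded.Exists(s) Then keep(s) = vbNullString
        Next i

        .AutoFilterMode = False
        ' keep.Keys is an array of strings, which is what xlFilterValues expects
        If keep.Count > 0 Then
            .Range("K1:K5000").AutoFilter Field:=1, _
                Criteria1:=keep.Keys, Operator:=xlFilterValues
        End If
    End With
End Sub
```

Note that the dictionary lookup is case-sensitive by default, so "Smith" and "smith" count as different names here. If every name is excluded, keep is empty and this sketch applies no filter at all.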
I think you first need to understand how to use the Advanced Filter.
There is a good tutorial you can find HERE.
Now based on that, let us make an example. Suppose you have the data below:
Now, let us say you want to filter out Data1 and Data2.
According to the link, you can use a formula as criteria, but:
Note: always place a formula in a new column. Do not use a column label or use a column label that is not in your data set. Create a relative reference to the first cell in the column (B6). The formula must evaluate to TRUE or FALSE.
So in our case, our relative reference is A11 (the first cell or item in the field you want filtered). Now we make a formula in B2, since we cannot use A2: column A carries a column label. Enter the formula: =A11<>"Data1".
Above took care of Data1 but we need to filter out Data2 as well.
So we make another formula in C2 which is: =A11<>"Data2"
Once properly set up, you can now apply the Advanced Filter manually or programmatically. Code similar to yours is found below:
With Sheets("Sheet1")
.Range("A10:A20").AdvancedFilter xlFilterInPlace, .Range("A1:C2")
End With
And voilà! We have successfully filtered out Data1 and Data2.
Result:
It took me a while to get the hang of it as well, but thanks to the link above I managed to pull it off. I have learned something new today as well :-). HTH.
Additional:
I see that you have your criteria on another Sheet so you have to just use that in your formula. So if in our example you have Data1 and Data2 in H2:H3 in Sheet2, your formula in B2 and C2 is: =A11<>Sheet2!H2 and =A11<>Sheet2!H3 respectively.
You don't really even need VBA for this... to achieve the same result:
Put the values into a separate spreadsheet, in the first column.
Create 2 new columns next to the data you want to filter in your original spreadsheet
In the first column next to your data to be filtered, use
=VLOOKUP(A2, [nameOfOtherSpreadSheet.xlsx/xlsm/xls/etc]sheetName!$A:$A,1, FALSE)
Where A2 is the value you're searching for, field 2 is the reference of the range in which you want to search for this value, 1 is the index of the column in which you're searching, and FALSE tells VLOOKUP to only return exact matches.
In the second column next to the data you want to filter, use
=IFERROR(G2, FALSE)
Where G2 is the reference of the function that might return an error, and FALSE is the value you want to return if that function throws an error.
Filter the second column next to the data you want to filter for FALSEs
This should return the original data set without the values you wanted to exclude.
Record a macro of these steps so that future runs take one step instead of five.
I have a sheet where I would like to filter for the blank rows in columns T and U.
I have certain cases to consider.
A few rows are missing, and I have marked them as "Missing" in column S. If a row is marked as missing, I don't want it considered by the filter condition. By default, column S is blank.
In the other case, if either column T or column U is blank in a row, the row has to be filtered. If both columns are blank, the row also has to be filtered.
I have attached two images for reference. Could anyone suggest how I could do this? I am a beginner in VBA; any lead would be helpful.
Sub FC()
Dim ws As Worksheet
Set ws = Sheets("FC")
With ws
.Range("A5:T1000").autofilter Field:=20, Criteria1:="=", Operator:=xlFilterValues
End With
End Sub
I tried the above code; it works for column T.
How can I include multiple criteria? In my case, if column S says missing, I don't need to consider the row at all. And if T and U are both blank, or either one is blank, I need the row to be filtered.
This is how my sheet looks in the beginning.
I would like code that filters the rows in which column T or column U is blank (or both are), ignoring the rows marked as missing in column S.
Ok so here's how you can achieve your custom filtering using a helper column. Let's take column Z for this mission.
Sub FC()
With Sheets("FC").Range("Z5:Z100")
.EntireColumn.Hidden = True ' <-- optional, to hide the temp column
.Formula = "=AND(S5<>""Missing"",OR(ISBLANK(T5),ISBLANK(U5)))"
.AutoFilter 1, True
End With
End Sub
I am new to VBA and am trying to delete unwanted columns loaded from a .csv file. I am importing a large amount of data but then I ask the user what columns they want to keep going by "ID num.". There are a lot of columns with different ID no. and I want to ask the user what they want to keep and delete the rest.
The problem is I need to delete all the other columns the user didn't want but I still need to keep the first 6 columns and the last two columns as that is different information.
Here is what I have so far:
Sub Select()
'the below will take the users inputs
UserValue = InputBox("Give the ID no. to keep seperating with a comma e.g"12,13,14")
'the below will pass the user inputs to the example to split the values
Call Example(UserValue)
End Sub
Sub Example(UserValue)
TestColArray() = Split(UserValue, ",")
For Each TestCol In TestColArray()
' keep all the columns user wants the delete the rest except the first 6 columns and last 2
Next TestCol
End Sub
That is what I have so far. It is not much, but the user could enter a lot of columns with different ID numbers in the input box. The way the Excel sheet is laid out, all the ID numbers are in row 2, and the first 6 and last 2 columns are blank in row 2 since the ID number does not apply to them. I hope that helps.
Try this (commented) code:
Option Explicit '<--| use this statement: at the cost of having to declare every variable you use, your code will be much easier to debug and maintain
Sub MySelect()
Dim UserValue As String
'the below will take the users inputs
UserValue = Application.InputBox("Give the ID no. to keep separating with a comma e.g: ""12,13,14""", Type:=2) '<--| use Type:=2 to force a string input
'the below will pass the user inputs to the example to split the values
Example UserValue '<--| syntax 'Call Example(UserValue)' is old
End Sub
Sub Example(UserValue As String)
Dim TestCol As Variant
Dim cellsToKeep As String
Dim firstIDRng As Range, lastIDRng As Range, IDRng As Range, f As Range
Set firstIDRng = Range("A2").End(xlToRight) '<-- first ID cell
Set lastIDRng = Cells(2, Columns.Count).End(xlToLeft) '<-- last ID cell
Set IDRng = Range(firstIDRng, lastIDRng) '<--| IDs range
cellsToKeep = firstIDRng.Offset(, -6).Resize(, 6).Address(False, False) & "," '<--| initialize cells-to-keep addresses list with the first six blank cells at the left of first ID
For Each TestCol In Split(Replace(UserValue, " ", ""), ",") '<--| loop through passed ID's
Set f = IDRng.Find(what:=TestCol, LookIn:=xlValues, lookat:=xlWhole, MatchCase:=False) '<--| search for the current passed IDs range
If Not f Is Nothing Then cellsToKeep = cellsToKeep & f.Address(False, False) & "," '<--| if the current ID is found then update cells-to-keep addresses list
Next TestCol
cellsToKeep = cellsToKeep & lastIDRng.Offset(, 1).Resize(, 2).Address(False, False) '<--| finish cells-to-keep addresses list with the first two blank cells at the right of last ID
Range(cellsToKeep).EntireColumn.Hidden = True '<-- hide columns-to-keep
ActiveSheet.UsedRange.EntireColumn.SpecialCells(xlCellTypeVisible).EntireColumn.Delete '<--| delete only visible columns
ActiveSheet.UsedRange.EntireColumn.Hidden = False '<-- unhide columns
End Sub
It is assumed that the code works on the currently active worksheet.
A simple google search produces this. On the first page of results too. Perhaps this will suit your needs.
If the data set that needs to be deleted is really large (larger than the ranges you want to keep too.) Then perhaps only select the columns you want to have whilst you import the csv? This stackoverflow question shows how to import specific columns.
EDIT:
So from what I believe the OP is stating as the problem, a large CSV file is being imported into Excel. After importing, there are a lot of redundant columns that should be deleted. My first thought would be to import only the needed data (columns) in the first place. This is possible via VBA by using the .TextToColumns method with the FieldInfo argument. As stated above, the Stack Overflow question linked above provides a means of doing so.
If selective importing is not an option and you are still keen on inverting the user's selection, one option would be to create 2 ranges (one being the user-selected ranges and the second being the entire sheet). You could then perform an intersect check between the two ranges and delete a range if there is no intersection (i.e. delete any cell that is not part of the user's selection). This method is provided by the first link I supplied and is quite straightforward.
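The intersect check can be sketched roughly as follows. This is only an illustration, not the code from the linked page: the sub name DeleteColumnsOutsideSelection is made up, and it assumes the user selects the columns to keep with the mouse (including the first 6 and last 2 columns) rather than typing ID numbers.

```vba
Sub DeleteColumnsOutsideSelection()
    Dim keepRng As Range
    Dim c As Long, lastCol As Long

    ' Type:=8 makes Application.InputBox return a Range;
    ' cancelling raises an error on Set, hence the error guard
    On Error Resume Next
    Set keepRng = Application.InputBox("Select the columns to keep", Type:=8)
    On Error GoTo 0
    If keepRng Is Nothing Then Exit Sub

    With keepRng.Worksheet
        lastCol = .UsedRange.Column + .UsedRange.Columns.Count - 1
        ' Walk right to left so that deleting a column does not
        ' change the index of the columns still to be checked
        For c = lastCol To 1 Step -1
            If Intersect(.Columns(c), keepRng) Is Nothing Then
                .Columns(c).Delete
            End If
        Next c
    End With
End Sub
```

Hold Ctrl while selecting to pick several separate column blocks in the input box. Deleting columns cannot be undone from VBA, so try it on a copy of the sheet first.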
I'm trying to find/make a program that will find all my duplicate words in Excel. For example, A1 contains "someone", A2 contains "person", etc., but "someone" or another word will appear multiple times and I need to condense that information together. I need to do it in a way where I don't search manually to concatenate duplicates. So is there a way to find the duplicate words and concatenate them?
I have also looked into using "FIND" to look for them, but it has yielded no luck yet. I have also been using "FILTER", but I don't know a way to condense the duplicates without doing it manually. I have also been wondering where you can find the code for functions like "FIND", "REPLACE", etc. If I could find that, I could change the code for "REMOVE DUPLICATES" to make it work for words. But I don't really know whether that would work or not. Anything would help.
For example:
column1 column2 column3
-----------------------------
y A (nothing)
z B (nothing)
z (nothing) I
x (nothing) k
y (nothing) j
x C (nothing)
to this
column1 column2 column3
-----------------------------
y A j
z B I
x C k
except the letters are words.
I don't know if you could do this with formulas in Excel unless you know what word you are looking for within the cell. You could try either a UDF, or a Regular Expression.
my question and answer with links might get you started:
StackOverflow: formula to see if a surname is repeated within a cell
and maybe:
VBA Express
Once you've posted your Excel worksheet with data, we can see if I've got it wrong!
You could use advanced filter to copy unique values from column 1 to a new column. Then you would use a vlookup formula to get the rest.
Assumptions:
Row 1 is a header row so actual data starts in row 2
Column1 is column "A"
Column2 is column "B"
Column3 is column "C"
The new column with the unique values is column "E".
In cell F2 and copied over to G2 and then down as needed:
=INDEX(INDEX($B$2:$C$7,0,COLUMNS($E2:E2)),MATCH(1,INDEX(($A$2:$A$7=$E2)*(INDEX($B$2:$C$7,0,COLUMNS($E2:E2))<>""),),0))
Sheet1 Before:
Code:
Sub Macro1()
With Sheet1
.Columns("A:A").AdvancedFilter Action:=xlFilterCopy, _
CriteriaRange:=.Range("F1:F2"), CopyToRange:=.Range("K1"), Unique:=True
.Columns("B:B").AdvancedFilter Action:=xlFilterCopy, _
CriteriaRange:=.Range("G1:G2"), CopyToRange:=.Range("L1"), Unique:=True
.Columns("C:C").AdvancedFilter Action:=xlFilterCopy, _
CriteriaRange:=.Range("H1:H2"), CopyToRange:=.Range("M1"), Unique:=True
End With
End Sub
Sheet1 After:
make sure field names are used.
This will give you a function that will find the first non-blank cell against a specific string
Option Explicit
Function NonBlankLookup(SearchTxt As String, LookIn As Range, OffSetRows As Long) As Variant
Dim loc As Range
Dim FirstFound As Range
Set loc = LookIn.Find(what:=SearchTxt)
While Not (loc Is Nothing)
If Not IsEmpty(loc.Offset(0, OffSetRows)) Then
NonBlankLookup = loc.Offset(0, OffSetRows).Value
Exit Function
End If
If FirstFound Is Nothing Then
Set FirstFound = loc
ElseIf loc.Address = FirstFound.Address Then 'compare addresses, not values: every match has the same value
NonBlankLookup = CVErr(2000)
Exit Function
End If
Set loc = LookIn.Find(what:=SearchTxt, after:=loc)
Wend
NonBlankLookup = CVErr(2000)
End Function
To use it, insert this code into a module. Then, in your Excel spreadsheet, you can use a formula like =NonBlankLookup(E1,$A$1:$A$6,1), which will search for your text in A1:A6 and check 1 column to the right (despite its name, OffSetRows is a column offset). If no text is found that matches the search string, or if the text is found but no data exists in the specified column, #NULL! is returned.
This also has a slight advantage over VLOOKUP, as it allows a negative offset, so you could have the search text in column 2 and, by using -1 for the offset, return data from column 1.
Just so you are aware, because of the way that .find works, when you specify a range, it will start at the 2nd cell, and go down, and search the first cell you give it last.
e.g. with my example of A1:A6, it will search A2,A3,A4,A5,A6 and finally A1
A lot of the solutions here on SO involve using CountIf to find duplicates. When I have a list of 100,000+ values however, it will often take minutes for CountIf to search for duplicates.
Is there a quicker way to search for duplicates within an Excel column WITHOUT using CountIf?
Thanks!
EDIT #1:
After reading the comments and replies I realize I need to go into greater detail. Let's pretend I'm a birdwatcher, and after I return from a birdwatching trip I input anywhere from 1 to 25 or 50 new birds that I saw on my trip into my "Master List of Birds Seen". This is really a dynamically growing list, and with each addition I want to make sure I'm not duplicating something that already exists in my list.
So, in column A of my file are the names of the birds. Column B-M might contain other attributes of the birds. I want to know if a bird that I just added in column A after my latest birdwatching trip ALREADY exists somewhere ELSE in my list. And, if it does, I would manually merge the data of the 2 entries and throw away some and keep some after careful review. I clearly don't want to have duplicate entries of the same bird in my database.
So, ultimately I want some indication that there is or isn't a duplicate somewhere else, and if there is duplicate please tell me what row to look in (or highlight or color both of the duplicates).
The fastest way that I know of (in case you are using Excel 2007/2010/2011) is to use Data (In Ribbon) | Remove Duplicates to find the total number of duplicates OR to remove duplicates. You might want to move data to a temp sheet before you test this.
The 2nd fastest way is to use Countif. Now Countif can be used in many ways to find duplicates. Here are two main ways.
1) Inserting a New Column next to the data and putting the formula and simply copying it down.
2) Using Countif in Conditional formatting to highlight cells which are duplicates. For more details, please see this link.
suggestions for a macro to find duplicates in a SINGLE column
EDIT:
My Apologies :)
Countif is the 3rd fastest way!
The 2nd fastest way is to use Pivot Tables ;)
What exactly is your main purpose of finding duplicates? Do you want to delete them? Or Do you want to highlight them? Or something else?
FOLLOWUP
Seems like I made a typo in the formula. Yes for large number of rows, CountIf does take minutes as you suggested.
Let me see if I can come up with a VBA code to suit your exact needs.
Sid
You can use VBA - the following function returns a list of unique entries within a list of 100,000 in less than a second. Usage: select a range, type the formula (=getUniqueListFromRange(YourRange)) and validate with CTRL+SHIFT+ENTER.
Public Function getUniqueListFromRange(parRange As Range) As Variant
' Returns a (1 to n,1 to 1) array with all the values without duplicates
Dim i As Long
Dim j As Long
Dim locKey As Variant
Dim locData As Variant
Dim locUniqueDict As Variant
Dim locUniqueList As Variant
On Error GoTo error_handler
locData = Intersect(parRange.Parent.UsedRange, parRange)
Set locUniqueDict = CreateObject("Scripting.Dictionary")
On Error Resume Next
For i = 1 To UBound(locData, 1)
For j = 1 To UBound(locData, 2)
locKey = UCase(locData(i, j))
If locKey <> "" Then locUniqueDict.Add locKey, locData(i, j)
Next j
Next i
If locUniqueDict.Count > 0 Then
ReDim locUniqueList(1 To locUniqueDict.Count, 1 To 1) As Variant
i = 1
For Each locKey In locUniqueDict
locUniqueList(i, 1) = locUniqueDict(locKey)
i = i + 1
Next
getUniqueListFromRange = locUniqueList
End If
error_handler: 'Empty range
End Function
If using Excel 2007 or later (which is likely from the 100,000+ values) you can choose:
Home Tab | Conditional Formatting > Highlight Cell Rules > Duplicate Values...
Right-click a highlighted cell and filter by selected cell color to show just the duplicates (be aware however this can be slow with conditional formatting).
Alternatively run this code and filter for colored cells which takes only a second on 100,000 cells:
Sub HighlightDupes()
Dim i As Long, dic As Variant, v As Variant
Application.ScreenUpdating = False
Set dic = CreateObject("Scripting.Dictionary")
i = 1
For Each v In Selection.Value2
If dic.exists(v) Then dic(v) = "" Else dic.Add v, i
i = i + 1
Next v
Selection.Font.Color = 255
For Each v In dic
If dic(v) <> "" Then Selection(dic(v)).Font.Color = 0
Next v
End Sub
Addendum:
To select only duplicate values without code or formulas, I have found this method useful:
Data Tab | Advanced Filter... Filter in Place, Unique Records Only, OK.
Now select the range of unique values and press Alt+; (Goto Special... Visible cells only). With this selection clear the filter and you will see that all unselected cells are duplicates, you can then press Ctrl+9 (Hide Rows) to show just the duplicates. These rows can be copied to another sheet if needed or marked with an "X".
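Since the question also asks which row to look in, here is a minimal dictionary-based sketch that reports each duplicate together with the row of its first occurrence. It assumes the bird names are in column A of the active sheet, and it ignores case and leading/trailing spaces; the sub name ListDuplicateRows is made up for illustration.

```vba
Sub ListDuplicateRows()
    Dim dic As Object
    Dim data As Variant
    Dim i As Long, lastRow As Long
    Dim key As String

    Set dic = CreateObject("Scripting.Dictionary")
    dic.CompareMode = vbTextCompare   ' treat "Robin" and "robin" as the same name

    With ActiveSheet
        lastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
        If lastRow < 2 Then Exit Sub  ' a single cell cannot contain duplicates
        data = .Range("A1:A" & lastRow).Value   ' read the column in one go
    End With

    For i = 1 To UBound(data, 1)
        If Not IsError(data(i, 1)) Then
            key = Trim$(CStr(data(i, 1)))
            If Len(key) > 0 Then
                If dic.Exists(key) Then
                    ' dic(key) holds the row where this name first appeared
                    Debug.Print "Row " & i & " duplicates row " & dic(key) & ": " & key
                Else
                    dic.Add key, i
                End If
            End If
        End If
    Next i
End Sub
```

The results go to the Immediate window (Ctrl+G in the VBA editor). Reading the column into an array first avoids touching the worksheet once per cell, which is what keeps this kind of loop fast on large lists.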
You do not mention what you want to do when you find them. If you merely want to see where they are...
Sub HighLightCells()
ActiveSheet.UsedRange.Cells.FormatConditions.Delete
ActiveSheet.UsedRange.Cells.FormatConditions.Add Type:=xlCellValue, Operator:=xlEqual, Formula1:=ActiveCell
ActiveSheet.UsedRange.Cells.FormatConditions(1).Interior.ColorIndex = 4
End Sub
Preventing Duplicates with Data Validation
You can use Data Validation to prevent yourself from entering duplicate bird names. See Debra Dalgleish's site here
Handling existing duplicates
My free Duplicate Master addin will let you
Select
Colour
List
Delete
duplicates.
But more importantly it will let you run more complex matching than exact strings, ie
Case Insensitive / Case Sensitive searches (sample below)
Trim/Clean data
Remove all blank spaces (including CHAR(160)) see the " mapgie" and "magpie" example below
Run regular expression matches (for example the sample below replaces s$ with "" to remove plurals)
Match on any combination of columns (ie Column A, all columns, Column A&B etc)
I'm surprised that no one has mentioned the RemoveDuplicates method.
ActiveSheet.Range("A:A").RemoveDuplicates Columns:=1
This will simply remove any duplicate entries on the active worksheet in column A. It takes milliseconds to run (tested with 200k rows). Mind you, this will strictly delete all the duplicate entries. Although that isn't how the original question was worded, I do believe that this still serves your purpose.
One simple way of finding unique values is to use the advanced filter, filter for unique values only, and copy and paste them into another sheet, because once the filter is removed you get the whole data set back with the duplicates in it.
Sort the range
and in the next column put =IF(A2=A1;1;IF(A2=A3;1;0))
"1" will be displayed for duplicates.
I have a fairly simple syntax question:
I'm trying to copy and paste n rows from one excel file to another. In addition, I'd like to store the total copied rows into a variable.
Can someone help me accomplish this?
For example:
1)
Activate CSV file
Apply Filter to Column B (Page Title) & uncheck "blanks" ("<>") filter
Windows("Test_Origin.xlsm").Activate
ActiveSheet.Range("$A$1:$J$206").AutoFilter Field:=2, Criteria1:="<>"
2)
Copy Filtered Lines with data (Excluding Row 1)
Range("B2:F189").Select
Selection.Copy
copiedRowTotal = total *FILTERED* rows copied over from original sheet, then Test Number iterates that many times
copiedRowTotal = Selection.Rows.Count
MsgBox copiedRowTotal
Thanks
An indirect way to do this is
Range("B2:F189").Copy
Range("M2").PasteSpecial xlPasteValues
copiedRowTotal = Selection.Rows.Count
Selection.Clear
The code copies the range & does a paste special operation on a separate location.
By doing this, only filtered rows are copied to M2 & the area (where the filtered rows are pasted) is highlighted when PasteSpecial operation is done.
Selection.Rows.Count then gives the number of filtered rows that were pasted.
After figuring out the number of filtered rows, the selection is cleared up.
I don't believe there is a way to get the visible cell count directly. I tried using 'SpecialCells(xlCellTypeVisible)', but could not get the correct count with a filter applied. Here is a quick function I wrote that works with a filter applied.
Also be aware that a filter can sometimes mess with the selected range, so it's something to note.
Public Sub TestIt()
Dim visibleCount As Long
visibleCount = GetVisibleCount(Sheets(1).Range("A2:H3000"))
MsgBox visibleCount
End Sub
Public Function GetVisibleCount(rng As Range) As Long
Dim loopRow As Range
GetVisibleCount = 0
For Each loopRow In rng.Rows
If loopRow.Hidden = False Then
GetVisibleCount = GetVisibleCount + 1
End If
Next loopRow
End Function
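For comparison, a sketch of how SpecialCells(xlCellTypeVisible) can be made to give a row count: the visible cells come back as several separate areas, so the rows of each area have to be added up. The function name CountVisibleRows is made up, and the sketch assumes rng is one contiguous range whose first column is not hidden.

```vba
Public Function CountVisibleRows(rng As Range) As Long
    Dim vis As Range, area As Range

    ' On a single cell, SpecialCells looks at the whole sheet instead,
    ' so handle the one-row case separately
    If rng.Rows.Count = 1 Then
        If Not rng.EntireRow.Hidden Then CountVisibleRows = 1
        Exit Function
    End If

    ' SpecialCells raises an error when no cell is visible
    On Error Resume Next
    Set vis = rng.Columns(1).SpecialCells(xlCellTypeVisible)
    On Error GoTo 0
    If vis Is Nothing Then Exit Function   ' nothing visible: count stays 0

    ' Each area is one unbroken block of visible rows
    For Each area In vis.Areas
        CountVisibleRows = CountVisibleRows + area.Rows.Count
    Next area
End Function
```

It is meant to be called from VBA, for example copiedRowTotal = CountVisibleRows(ActiveSheet.Range("B2:F189")), rather than from a worksheet formula.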
copiedrowtotal = selection.rows.count ' its not selection.totalcells
I think this would do the trick
After seeing your update let me tell you probably these would work
dim i as long
i = Application.WorksheetFunction.Subtotal(2,worksheets("Sheet").Range("B2:F189"))
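Now i holds the count for the filtered (visible) rows. If you have included the header in your range, subtract 1 at the end; otherwise leave it as it is.
Argument 2 in Subtotal selects COUNT, which counts only numeric cells in the rows left visible by the filter (use 3, COUNTA, if the cells hold text). Then come the sheet name
and the range in which to count filtered rows.
Instead of B2:F189, I would pass only one column if you applied the filter to many columns, so that each visible row is counted once.
Hope it helps, and don't forget to accept an answer!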