How to delete unselected columns from range - vba

I am new to VBA and am trying to delete unwanted columns loaded from a .csv file. I import a large amount of data and then ask the user which columns they want to keep, identified by ID number. There are many columns with different ID numbers; I want to keep the ones the user asks for and delete the rest.
The problem is that, besides the columns the user chose, I also need to keep the first 6 columns and the last 2 columns, because they hold different information.
Here is what I have so far:
Sub Select()
'the below will take the users inputs
UserValue = InputBox("Give the ID no. to keep, separating with a comma, e.g. ""12,13,14""")
'the below will pass the user inputs to the example to split the values
Call Example(UserValue)
End Sub
Sub Example(UserValue)
TestColArray() = Split(UserValue, ",")
For Each TestCol In TestColArray()
' keep all the columns user wants the delete the rest except the first 6 columns and last 2
Next TestCol
End Sub
That is all I have so far. It is not much, but the user could enter many different ID numbers in the input box. In the Excel sheet, all the ID numbers are in row 2, and the first 6 and last 2 columns are blank in row 2 because the ID number does not apply to them. I hope that helps.

Try this (commented) code:
Option Explicit '<--| use this statement: at the cost of having to declare every variable, your code will be much easier to debug and maintain
Sub MySelect()
    Dim UserValue As String
    'the below will take the user's inputs
    UserValue = Application.InputBox("Give the ID no. to keep, separated by commas, e.g.: ""12,13,14""", Type:=2) '<--| use Type:=2 to force a string input
    'the below will pass the user's inputs to Example, which splits the values
    Example UserValue '<--| the 'Call Example(UserValue)' syntax is old
End Sub
Sub Example(UserValue As String)
    Dim TestCol As Variant
    Dim cellsToKeep As String
    Dim firstIDRng As Range, lastIDRng As Range, IDRng As Range, f As Range
    Set firstIDRng = Range("A2").End(xlToRight) '<-- first ID cell
    Set lastIDRng = Cells(2, Columns.Count).End(xlToLeft) '<-- last ID cell
    Set IDRng = Range(firstIDRng, lastIDRng) '<--| IDs range
    cellsToKeep = firstIDRng.Offset(, -6).Resize(, 6).Address(False, False) & "," '<--| initialize cells-to-keep addresses list with the first six blank cells at the left of first ID
    For Each TestCol In Split(Replace(UserValue, " ", ""), ",") '<--| loop through passed IDs
        Set f = IDRng.Find(what:=TestCol, LookIn:=xlValues, lookat:=xlWhole, MatchCase:=False) '<--| search the IDs range for the current passed ID
        If Not f Is Nothing Then cellsToKeep = cellsToKeep & f.Address(False, False) & "," '<--| if the current ID is found then update cells-to-keep addresses list
    Next TestCol
    cellsToKeep = cellsToKeep & lastIDRng.Offset(, 1).Resize(, 2).Address(False, False) '<--| finish cells-to-keep addresses list with the first two blank cells at the right of last ID
    Range(cellsToKeep).EntireColumn.Hidden = True '<-- hide columns-to-keep
    ActiveSheet.UsedRange.EntireColumn.SpecialCells(xlCellTypeVisible).EntireColumn.Delete '<--| delete only visible columns
    ActiveSheet.UsedRange.EntireColumn.Hidden = False '<-- unhide columns
End Sub
This assumes the code is run against the currently active worksheet.

A simple Google search produces this, on the first page of results too. Perhaps it will suit your needs.
If the data set to be deleted is really large (larger than the ranges you want to keep), then perhaps import only the columns you want when you load the CSV. This Stack Overflow question shows how to import specific columns.
EDIT:
As I understand the OP's problem, a large CSV file is being imported into Excel, and after importing there are a lot of redundant columns that should be deleted. My first thought would be to import only the needed data (columns) in the first place. This is possible via VBA by using the .TextToColumns method with the FieldInfo argument. The Stack Overflow question linked above provides a means of doing so.
If selective importing is not an option and you are still keen on inverting the user's selection, one option is to create two ranges (one being the user-selected ranges and the second being the entire sheet). You can then perform an intersect check between the two and delete any part of the sheet that does not intersect the user's selection (i.e. delete any cell that is not part of it). This method is provided by the first link I supplied and is quite straightforward.
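The intersect check described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the linked page: `DeleteColumnsOutside` and its `keepRng` parameter are hypothetical names, and `keepRng` is assumed to already hold the columns the user wants to keep (including the first six and last two).

```vba
Option Explicit

' Deletes every used column of keepRng's worksheet that does not intersect keepRng.
' DeleteColumnsOutside and keepRng are hypothetical names used for illustration.
Sub DeleteColumnsOutside(keepRng As Range)
    Dim ws As Worksheet
    Dim firstCol As Long, lastCol As Long, c As Long

    Set ws = keepRng.Worksheet
    With ws.UsedRange
        firstCol = .Column
        lastCol = .Column + .Columns.Count - 1
    End With

    ' Work right to left so a deletion never shifts a column still to be checked
    For c = lastCol To firstCol Step -1
        If Intersect(ws.Columns(c), keepRng) Is Nothing Then
            ws.Columns(c).Delete
        End If
    Next c
End Sub
```

It might be called as `DeleteColumnsOutside ActiveSheet.Range("A:F,H:H,K:L")`, where the address string is built from the user's input.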

Related

Faster Workflow

I have a table (Table 1) with a whole bunch of well data (versions, MD, HD, etc.) and I want to create another table (Table 2) that will only show the data for the well I am interested in.
I have it set up where you select the well using a drop down list. Then I want Table 2 to be populated with four values for each of the iterations that show up in Table 1....
I tried using vlookup but was having issues when a well had multiple versions. And I also tried using an advanced filter.
Screenshot of the spreadsheet
Let's solve this using a helper column. First, assume column A, to the left of your table, is used to show the row number in which each of these entries is found.
A5 would have the following formula:
=MATCH($C$1,K:K,0)
This shows us the row number that Well1 is first matched at. Then A6 and copied down would have the formula:
=A5+MATCH(B6,OFFSET(K1,A5,0,COUNT(M:M),1),0)
This uses OFFSET to create a new range, starting at the cell immediately below the previous match for Well1, and then uses MATCH to find what row that occurs.
So now, column A will always show the row number to pull data from. The rest is simply using the INDEX function to pull from your desired columns. For example, the data in column C pulls the iteration from column L, and can be pulled through formula like so, in cell C5 and copied to the right / down:
=INDEX(L:L,$A5)
If your data is appropriately normalized, you might be better off with a Pivot Table. This would give you the option of filtering by Well ID.
To use an Advanced Filter you will need to create a worksheet event handler. Place this in the code module for the sheet on which you want the data.
Private Sub Worksheet_Change(ByVal Target As Range)
    If Not Intersect(Target, Range("A2")) Is Nothing Then
        Dim dataRng As Range
        Dim critRng As Range
        Dim CpyToRng As Range
        Dim cpytoarr() As Variant
        With Worksheets("Sheet1")
            Set dataRng = .Range(.Cells(1, 1), .Cells(1, 1).End(xlDown).End(xlToRight))
        End With
        With Me
            .Range("CC1") = .Cells(1, 1).Value
            .Range("CC2") = "'=" & .Cells(2, 1).Value
            Set critRng = .Range("CC1:CC2")
            Set CpyToRng = .Range(.Cells(6, 1), .Cells(6, 1).End(xlToRight))
        End With
        Debug.Print dataRng.Address
        Debug.Print critRng.Address
        Debug.Print CpyToRng.Address
        dataRng.AdvancedFilter Action:=xlFilterCopy, _
            CriteriaRange:=critRng, CopyToRange:=CpyToRng, _
            Unique:=False
        critRng.ClearContents
    End If
End Sub
How this works: it assumes the data is on Sheet1 and starts in A1, with no blanks in column A or in the last row.
On Sheet2 set it up like this:
It is important that the header names on Sheet2 are identical to the headers on Sheet1.
Now every time the value in A2 on Sheet2 (your drop-down) changes, the requisite data will appear below row 6.

Manipulating Excel spreadsheet, removing rows based on values in a column and then removing more rows based on values in another column

I have a rather complicated problem.
I have a log file that, when put into Excel, has event IDs in column I and, in column J, a custom key that keeps a particular event grouped.
All I want to do is remove any rows that do not contain a given value, say 102, in the event ID column.
And THEN I need to check the custom key (column J) and remove rows that are duplicates, since any duplicates will distort other statistics I want.
I have gotten as far as retrieving the values from the columns using COM objects, .EntireColumn, cell values and so on, but I am completely stumped as to how to piece together a solid way to remove rows. I could not figure out how to get the row for each value.
To give a bit more clarity this is my thought process on what i need to do:
If cell value in Column I does not = 102 Then delete the row that cell contains.
Repeat for all rows in spreadsheet.
And THEN-
Read every cell in column J and remove all rows containing duplicates based on the values in column J.
Save spreadsheet.
Can any kind persons help me?
Additional Info:
Column I holds a string that is an event id number e.g = 1029
Column J holds a string that is a mix of numbers and letters = 1ASER0X3NEX0S
Ellz, I do agree with Macro Man in that your tags are misleading and, more importantly, I did indeed need to know the details of Column J.
However, I got so sick of rude posts today and yours was polite and respectful so I've pasted some code below that will do the trick ... provided Column J can be a string (the details of which you haven't given us ... see what Macro Man's getting at?).
There are many ways to test for duplicates. One is to try and add a unique key to a collection and see if it throws an error. Many wouldn't like that philosophy but it seemed to be okay for you because it also gives you a collection of all the unique (ie remaining) keys in Column J.
Sub DeleteNon102sAndDuplicates()
    Dim ws As Worksheet
    Dim uniques As Collection
    Dim rng As Range
    Dim rowPair As Range
    Dim iCell As Range
    Dim jCell As Range
    Dim delRows As Range
    Dim deleteRow As Boolean
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    'note: rng includes any header row; start below it if the sheet has one
    Set rng = Intersect(ws.UsedRange, ws.Range("I:J"))
    Set uniques = New Collection
    For Each rowPair In rng.Rows
        Set iCell = rowPair.Cells(1, 1)
        Set jCell = rowPair.Cells(1, 2)
        'delete the row when its event ID is not 102 ...
        deleteRow = (CStr(iCell.Value2) <> "102")
        If Not deleteRow Then
            '... or when its key is already in the collection (error 457 = duplicate key)
            On Error Resume Next
            uniques.Add jCell.Value2, jCell.Text
            deleteRow = (Err.Number = 457)
            On Error GoTo 0
        End If
        If deleteRow Then
            If delRows Is Nothing Then
                Set delRows = rowPair.EntireRow
            Else
                Set delRows = Union(delRows, rowPair.EntireRow)
            End If
        End If
    Next
    If Not delRows Is Nothing Then
        MsgBox delRows.Address(False, False) & " deleted."
        delRows.Delete
    End If
End Sub
There are a number of ways in which this can be done, and which is best will depend on how frequently you perform this task and whether you want to have it fully automated. Since you've tagged your question with VBA I assume you'll be happy with a VBA-based answer:
Sub removeValues()
    Range("I1").Select 'Start at the top of the I column
    'We are going to go down the column until we hit an empty row
    Do Until IsEmpty(ActiveCell.Value) = True
        If ActiveCell.Value <> 102 Then
            ActiveCell.EntireRow.Delete 'Then delete the row
        Else
            ActiveCell.Offset(1).Select 'Select the cell below
        End If
    Loop
    'Now we have removed all non-102 values from the column, let's remove the duplicates from the J column
    Range("A:J").RemoveDuplicates Columns:=10, Header:=xlNo
End Sub
The key line there is Range("A:J").RemoveDuplicates. It will remove rows from the range you specify according to duplicates it finds in the column you specify. In that case, it will remove items from the A-J columns based on duplicates in column 10 (which is J). If your data extends beyond the J column, then you'll need to replace "A:J" with the appropriate range. Note that the Columns value is relative to the index of the first column, so while the J column is 10 when that range starts at A (1), it would be 2 for example if the range were only I:J. Does that make sense?
(Note: Using ActiveCell is not really best practice, but it's the method that most obviously translates to what you were trying to do and as it seems you're new to VBA I thought it would be the easiest to understand).
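For comparison, here is a minimal sketch of the same two steps without ActiveCell or Select. It is an illustration rather than a drop-in replacement: `RemoveNon102Rows` and its `ws` parameter are hypothetical names, and it assumes there is no header row and that column I contains no error values (CStr would fail on those).

```vba
Option Explicit

' Removes rows whose column I value is not 102, then de-duplicates on column J.
' RemoveNon102Rows is a hypothetical name; assumes no header row.
Sub RemoveNon102Rows(ws As Worksheet)
    Dim lastRow As Long, r As Long

    lastRow = ws.Cells(ws.Rows.Count, "I").End(xlUp).Row

    ' Loop bottom-up so deleting a row does not shift rows still to be checked
    For r = lastRow To 1 Step -1
        If CStr(ws.Cells(r, "I").Value) <> "102" Then
            ws.Rows(r).Delete
        End If
    Next r

    ' Column 10 of A:J is column J
    ws.Range("A:J").RemoveDuplicates Columns:=10, Header:=xlNo
End Sub
```

Calling `RemoveNon102Rows ActiveSheet` acts on the active sheet, as the answer's version does. Unlike that version, the bottom-up loop does not stop at the first empty cell in column I.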

Advance AutoFilter to exclude certain values

I want to filter a large list of names in a sheet in Excel. In another sheet I have a list of names that I want to exclude from the larger list. How would I use the Advanced Filter to do this? I have tried the code below but it does not seem to work. My big list is in K2:K5000 and my criteria are in H2:H3 (the criteria will grow, but I kept the list small for testing). Any help would be greatly appreciated!
Sub Filter()
Sheet5.Range("K2:K5000").AdvancedFilter Action:=xlFilterInPlace, _
CriteriaRange:=Sheets("Sheet3").Range("H2:H3"), Unique:=False
End Sub
To exclude the values in H2:H3 from K2:K5000 using Advanced Filter you can use the following approach:
Make sure cell K1 is not empty (enter any header)
Find 2 unused cells (e.g. I1:I2)
Leave I1 blank
Enter the following formula in I2
=ISNA(MATCH(K2,$H$2:$H$3,0))
Use the following code to exclude rows
Sheet5.Range("K1:K5000").AdvancedFilter Action:=xlFilterInPlace, _
    CriteriaRange:=Sheets("Sheet3").Range("I1:I2"), Unique:=False
I am not sure off the top of my head how you would use advanced filter to exclude, but you can use formulas in your advanced filter (near the bottom). You can, however, just use a dictionary to store values you want to exclude, then exclude (hide rows, or autofilter on the ones not found in your exclusion list)
Sub Filter()
    Dim i As Long
    Dim str As String
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
    With Worksheets("Sheet3")
        For i = 2 To 3
            str = CStr(.Range("H" & i).Value)
            If Not dict.exists(str) Then
                dict.Add str, vbNullString
            End If
        Next i
    End With
    With Sheet5
        For i = 2 To 5000
            str = CStr(.Range("K" & i).Value)
            If Len(str) > 0 And dict.exists(str) Then
                .Range("K" & i).EntireRow.Hidden = True
            Else
                'alternatively, you can add those that aren't found
                'to an array for autofilter
            End If
        Next i
    End With
    'If building autofilter array, apply filter here.
End Sub
Using AutoFilter:
Use an array of strings as criteria to filter on with the "Operator:=xlFilterValues" argument of AutoFilter. Build your array however you want, I chose to do it by building a string with a for loop and splitting (quick to write and test, but not ideal for a number of reasons).
Note: AutoFilter is applied to the headers, not data.
With Sheet5
    .AutoFilterMode = False
    .Range("K1").AutoFilter _
        Field:=1, _
        Criteria1:=arr, _
        Operator:=xlFilterValues
End With
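The snippet above uses `arr` without showing how it is built. One possible way, following the build-a-string-and-split idea mentioned earlier, is sketched below. `BuildKeepArray` and `FilterOutExcluded` are hypothetical names, the tab character is an assumed delimiter that must not occur in the names, and the cells are assumed to contain no error values.

```vba
Option Explicit

' Returns the distinct, non-empty items of allValues that are not in excludeValues.
' BuildKeepArray is a hypothetical helper; vbTab is an assumed-safe delimiter.
Function BuildKeepArray(allValues As Variant, excludeValues As Variant) As String()
    Dim seen As Object
    Dim v As Variant, s As String, joined As String

    Set seen = CreateObject("Scripting.Dictionary")
    For Each v In excludeValues
        seen(CStr(v)) = True
    Next v

    For Each v In allValues
        s = CStr(v)
        If Len(s) > 0 And Not seen.Exists(s) Then
            seen(s) = True                  ' also prevents repeats
            joined = joined & vbTab & s
        End If
    Next v

    ' Drop the leading delimiter, then split into a zero-based String array
    BuildKeepArray = Split(Mid$(joined, 2), vbTab)
End Function

Sub FilterOutExcluded()
    Dim arr() As String
    arr = BuildKeepArray(Sheet5.Range("K2:K5000").Value, _
                         Worksheets("Sheet3").Range("H2:H3").Value)
    If UBound(arr) < 0 Then Exit Sub        ' nothing left to show
    With Sheet5
        .AutoFilterMode = False
        .Range("K1").AutoFilter Field:=1, Criteria1:=arr, Operator:=xlFilterValues
    End With
End Sub
```

Because `For Each` walks every element of the two-dimensional array that `Range.Value` returns, the ranges can be passed in directly.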
I think you first need to understand how to use the Advanced Filter.
There is a good tutorial you can find HERE.
Now based on that, let us make an example. Suppose you have the data below:
Now, let us say you want to filter out Data1 and Data2.
According to the link, you can use a formula as a criterion, but:
Note: always place a formula in a new column. Do not use a column label or use a column label that is not in your data set. Create a relative reference to the first cell in the column (B6). The formula must evaluate to TRUE or FALSE.
So in our case, our relative reference is A11 (the first cell or item in the field you want filtered). Now we make a formula in B2, since we cannot use A2 (it is a column label). Enter the formula: =A11<>"Data1".
The above took care of Data1, but we need to filter out Data2 as well.
So we make another formula in C2, which is: =A11<>"Data2"
Once properly set up, you can apply the Advanced Filter manually or programmatically. Code similar to yours is shown below:
With Sheets("Sheet1")
    .Range("A10:A20").AdvancedFilter xlFilterInPlace, .Range("A1:C2")
End With
And voilà! We have successfully filtered out Data1 and Data2.
Result:
It took me a while to get the hang of it as well, but thanks to the link above I managed to pull it off. I have learned something new today as well :-). HTH.
Additional:
I see that you have your criteria on another sheet, so you just have to use that in your formula. If, in our example, you have Data1 and Data2 in H2:H3 on Sheet2, your formulas in B2 and C2 are =A11<>Sheet2!H2 and =A11<>Sheet2!H3 respectively.
You don't really even need VBA for this... to achieve the same result:
Put the values into a separate spreadsheet, in the first column.
Create 2 new columns next to the data you want to filter in your original spreadsheet
In the first column next to your data to be filtered, use
=VLOOKUP(A2, [nameOfOtherSpreadSheet.xlsx/xlsm/xls/etc]sheetName!$A:$A,1, FALSE)
Where A2 is the value you're searching for, field 2 is the reference of the range in which you want to search for this value, 1 is the index of the column in which you're searching, and FALSE tells VLOOKUP to only return exact matches.
In the second column next to the data you want to filter, use
=IFERROR(G2, FALSE)
Where G2 is the reference of the function that might return an error, and FALSE is the value you want to return if that function throws an error.
Filter the second column next to the data you want to filter for FALSEs
This should return the original data set without the values you wanted to exclude.
Record a macro of these steps so that in future it is one step instead of five.

Sorting Worksheet data by column values using Excel VBA

I have the following UserForm developed in VBA, which takes data from a worksheet and displays it.
I want to order all the data alphabetically by Segment. This is the code:
Function llenarDatosTabla()
    Dim vList As Variant
    Dim ws As Worksheet: Set ws = Worksheets(BD_PRODXSIST)
    ListBox1.Clear
    With ws
        If (IsEmpty(.Range("AA2").Value) = False) Then
            Dim ultimoRenglon As Long: ultimoRenglon = devolverUltimoRenglonDeColumna("A1", BD_PRODXSIST)
            vList = ws.Range("AA2:AA" & ultimoRenglon & ":AL2").Value
            If IsArray(vList) Then
                Me.ListBox1.List = vList
            Else
                Me.ListBox1.AddItem (vList)
            End If
        End If
        Me.ListBox1.ListIndex = -1
    End With
    Set vList = Nothing
    Set ws = Nothing
End Function
How can I order it by the 'AD' (SEGMENTO) column?
You can sort your Excel worksheet in ascending order using a VBA statement like the following:
Columns("A:XFD").Sort key1:=Range("AD:AD"), order1:=xlAscending, Header:=xlYes
Note: in the column range Columns("A:XFD") instead of XFD enter the last used column pertinent to your case, e.g. Columns("A:DD").
Hope this will help.
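If you would rather not type the last used column by hand, it can be looked up at run time. The following is a minimal sketch under stated assumptions: `SortBySegmentColumn` and its `ws` parameter are hypothetical names, headers are assumed to sit in row 1, and the table is assumed to start in column A and to extend at least as far as column AD.

```vba
Option Explicit

' Sorts the block A1:<last row, last column> of ws by column AD, ascending,
' keeping the header row in place. SortBySegmentColumn is a hypothetical name.
Sub SortBySegmentColumn(ws As Worksheet)
    Dim lastRow As Long, lastCol As Long

    lastRow = ws.Cells(ws.Rows.Count, "AD").End(xlUp).Row           ' last filled row in AD
    lastCol = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column    ' last header in row 1

    ' Nothing to sort, or the sort key would fall outside the range
    If lastRow < 2 Or lastCol < ws.Range("AD1").Column Then Exit Sub

    ws.Range(ws.Cells(1, 1), ws.Cells(lastRow, lastCol)).Sort _
        Key1:=ws.Range("AD1"), Order1:=xlAscending, Header:=xlYes
End Sub
```

Calling it before the list box is filled, for example `SortBySegmentColumn Worksheets(BD_PRODXSIST)`, would make the list show the rows in sorted order.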
To sort a data table, use Excel Names in conjunction with the CurrentRegion property. This is less risky than hard-coding column references and can be done in two simple steps.
The reason it's preferable to specifying columns is that if you get the columns wrong or they change later, you'll scramble your data! When you perform the sort, the cells in any omitted column(s) will remain where they are, becoming part of the wrong rows. And this is exactly what will happen if you add further columns later, unless you remember to update your VBA.
Here are the two simple steps for using this approach. For this example, I've chosen a data table with four columns and four rows:
We are going to sort by COL3 descending. The cells in the other three columns share identical values, enabling us to readily verify they all stay with the correct rows.
Step 1: choose a cell in the data table that's unlikely to ever be removed, such as the header of a column you intend to make permanent, and define a Name for this cell. You can define the name by selecting the cell and typing directly in Excel's Name dropdown above the worksheet. Here I've used the name RegionTag:
Straight away, CurrentRegion can reference the whole data table just from this. You can see it in action if you code a line of VBA to select the table:
Range("RegionTag").CurrentRegion.Select
This is the result:
That's just for illustration, showing the power of the Name/CurrentRegion combination. We don't need to select the table in order to sort it.
Step 2: define a second Name, this time for the column you want to sort by:
Make sure the Name refers to the entire column, selected by clicking the column header, rather than just a range of cells in the column.
That's it! With these two Names defined, we can sort the data table without concerning ourselves with its rows and columns, even if more are added later:
Range("RegionTag").CurrentRegion.Sort _
key1:=Range("SortCol"), order1:=xlDescending, Header:=xlYes
Here is our data table sorted using the above statement:

Put entire column (each value in column) in an array?

I'm making a macro to do a bunch of things. One of them is finding duplicates between Sheet1 and Sheet2: given column A in Sheet1, do any values in column B on Sheet2 match any of the values in column A on Sheet1?
I know there's a Remove Duplicates feature, but I just want to mark them, not remove them.
I was thinking of something with filtering. I know that when you filter you can select multiple criteria, so if you have a column with 20 different values in it, you can select 5 values in the filter and it will show the rows with those 5 values for that column. I recorded a macro of that and checked the code, and I see that it uses a string array, where each value to search for is an element. Is there any way to just specify an entire column and add every value to the string array?
Thanks in advance.
Here are three different ways to load items into an array. The first method is much faster but simply stores everything in the column. You have to be careful with it, though, because it creates a two-dimensional array, which can't be passed to AutoFilter directly.
Method 1:
Sub LoadArray()
    Dim strArray As Variant
    Dim TotalRows As Long
    TotalRows = Cells(Rows.Count, 1).End(xlUp).Row 'last used row in column A
    strArray = Range(Cells(1, 1), Cells(TotalRows, 1)).Value
    MsgBox "Loaded " & UBound(strArray) & " items!"
End Sub
Method 2:
Sub LoadArray2()
    Dim strArray() As String
    Dim TotalRows As Long
    Dim i As Long
    TotalRows = Cells(Rows.Count, 1).End(xlUp).Row 'last used row in column A
    ReDim strArray(1 To TotalRows)
    For i = 1 To TotalRows
        strArray(i) = Cells(i, 1).Value
    Next
    MsgBox "Loaded " & UBound(strArray) & " items!"
End Sub
Method 3: if you know the values ahead of time and just want to list them in a variable, you can assign them to a Variant using Array():
Sub LoadArray3()
    Dim strArray As Variant
    strArray = Array("Value1", "Value2", "Value3", "Value4")
    MsgBox "Loaded " & UBound(strArray) + 1 & " items!"
End Sub
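If you want the speed of reading the whole column at once but still need a one-dimensional array for AutoFilter, the two-dimensional result can be flattened afterwards. This is a minimal sketch, not part of the methods above: `ColumnTo1D` is a hypothetical helper name, and the cells are assumed to contain no error values (CStr would fail on those).

```vba
Option Explicit

' Flattens the n-by-1 array returned by Range.Value into a 1-D String array.
' ColumnTo1D is a hypothetical helper name used for illustration.
Function ColumnTo1D(src As Variant) As String()
    Dim result() As String
    Dim i As Long

    If Not IsArray(src) Then
        ' A single-cell range returns a scalar rather than an array
        ReDim result(1 To 1)
        result(1) = CStr(src)
    Else
        ReDim result(LBound(src, 1) To UBound(src, 1))
        For i = LBound(src, 1) To UBound(src, 1)
            result(i) = CStr(src(i, LBound(src, 2)))
        Next i
    End If

    ColumnTo1D = result
End Function
```

The result of, say, `ColumnTo1D(Range("A1:A20").Value)` can then be passed as `Criteria1` together with `Operator:=xlFilterValues`.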
I'm not sure whether anyone else will have this problem, so I figured I'd post the answer I found. I like the array solution posted by @Ripster (and thanks for that, it almost worked), but it won't really work in this case. What I'm working with is a large sheet of data with one ID column, and I want to check other sheets to see whether they contain duplicates of it (using the ID column). I don't want to delete them, just mark them so I can check them out. With potentially upwards of 50K rows, looping through each row would take a LONG time.
So, what I figured out I can do is copy the ID column from the other sheet into the main sheet and use conditional formatting to mark duplicates in some colour (it marks the cells in both columns). Then I can filter the column by colour to show only the colour I used to mark the duplicates. If I programmatically add a column of row numbers to the sheet I'm checking, I can even include that column in the main sheet, so that when I filter by colour I can see which rows the duplicates came from in their own sheet.
After doing that I can record and adapt a macro to do this automatically for my less programming-inclined co-workers.
Thanks much, all!
Edit - Added Code
After selecting the columns to compare, here is the code to mark the duplicates with red text and no fill:
Selection.FormatConditions.AddUniqueValues
Selection.FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
Selection.FormatConditions(1).DupeUnique = xlDuplicate
With Selection.FormatConditions(1).Font
.Color = -16383844
.TintAndShade = 0
End With
Selection.FormatConditions(1).StopIfTrue = False
Then, since both columns have the duplicates marked, you select the one that you actually want to examine. Here's the code to filter:
Selection.AutoFilter
ActiveSheet.Range("$C$1:$C$12").AutoFilter Field:=1, _
    Criteria1:=RGB(156, 0, 6), Operator:=xlFilterFontColor
(In my test I used column C as the one to filter; that can be set programmatically with a Cells() reference or a Range(Cells(), Cells()) sort of reference.)
I wish everyone the best of luck in their future endeavors! Thanks again to @Ripster.