VLookup multiple columns - vba

I am using VLookup function which looks up multiple values which are present in the column. This works very well but just takes a lot of time as I have 100,000 rows in the Excel sheet.
Is there any way to quicken this code?
The code basically looks up a particular value in a column and gets the offset. The difference between simple VLookup and this is that in case there are multiple rows with the same lookup value then it gets all the elements.
Function VLookupAll(ByVal lookup_value As String, _
ByVal lookup_column As Range, _
ByVal return_value_column As Long, _
Optional seperator As String = ", ") As String
Dim i As Long
Dim result As String
For i = 1 To lookup_column.Rows.Count
If Len(lookup_column(i, 1).Text) <> 0 Then
If lookup_column(i, 1).Text = lookup_value Then
result = result & (lookup_column(i).Offset(0, return_value_column).Text & seperator)
End If
End If
Next
If Len(result) <> 0 Then
result = Left(result, Len(result) - Len(seperator))
End If
VLookupAll = result
End Function

This is about 20-30x faster than a simple loop (tested over a column of 20k values, with 3 matches to the value being searched).
'rng: a single-column range to search for matches
'val: the value to match on
'col: offset from match in rng to concatenate values from (similar
' to the matching VLOOKUP argument)
Function MultiLookup(rng As Range, val As String, col As Long)
Dim i As Long, v, s
Dim r As Long
r = rng.Cells.Count
v = Application.Match(val, rng, 0)
s = ""
Do While Not IsError(v)
s = s & IIf(s <> "", ",", "") & rng.Cells(v).Offset(0, col - 1).Value
r = r - v
Set rng = rng.Offset(v, 0).Resize(r, 1)
v = Application.Match(val, rng, 0)
Loop
MultiLookup = s
End Function

http://www.excelhero.com/blog/2011/03/the-imposing-index.html says "Excel INDEX MATCH is significantly quicker than VLOOKUP"

You could try doing a Range.Find to see if the value exists at all in lookup column before proceeding. You are looping through every item in lookup column only to find it isn't there. If it were me, I would do a Range.find to see if lookup value is in lookup_column. If it is then you could do a countif to see how many occurrences there are...if there is only one occurrence, use plain old VLookup...and only fall back into your process if there is more than one occurrence.....may work....of course if Find fails, bail out of the function.
Another option is to load the lookup_column into any array...and process the array rather than the range.mnthat can sometimes help.

Summary:
Concate the values and do a vlookup on that new value.
For me I needed to have a formula and not a function to look up by 2 values. VLOOKUP could only work by a single value from what I've seen, so my solution was to concatenate the 2 values for a single primary key.
In my raw data tab I added a column called Lookup that simply concatenated the ID column with the Timestamp columns I had.
Then in my comparison tab I had
=VLOOKUP(CONCATENATE(A4, $F$1),'Historical Data'!$A:$G,3,FALSE)
Which took the ID column, concatenated with my lookup date at $F$1, and vlookup'ed into my data tab (Historical Data).

Related

UDF with Intersect runs slow

So I am creating a function to replace some manual index/match formulas. Note that this function works, but my problem is with speed. So I have a PivotTable with 6 columns and approx. 200.000 rows. I want this to find the value (and I don't use the pivotfunctions, meaning that this is just a table in pivot format) I found that this runs faster than having it in a regular data table. Both would be imported from a SQL table.
A single piece of this formula runs instantly, but the performance slows down when I have a few hundreds in the same sheet.
So any ideas on how to speed this up?
Function getnum2(ByVal Comp As String, Period As String, Measure As String, Optional BU As String, _
Optional Country As String, Optional Table As String, Optional TableSheet As String) As Double
Dim pTable As PivotTable, wTableSheet As Worksheet
If BU = "" Then
BU = "Group"
End If
If Country = "" Then
Country = "Total"
End If
If TableSheet = "" Then
Set wTableSheet = Worksheets("Data")
Else
Set wTableSheet = Worksheets(TableSheet)
End If
If Table = "" Then
Set pTable = wTableSheet.PivotTables("PivotTable1")
Else
Set pTable = wTableSheet.PivotTables(Table)
End If
'Find match
If Intersect(pTable.PivotFields("Bank").PivotItems(Comp).DataRange.EntireRow, _
pTable.PivotFields("Date").PivotItems(Period).DataRange.EntireRow, _
pTable.PivotFields("Business Unit").PivotItems(BU).DataRange.EntireRow, _
pTable.PivotFields("Country").PivotItems(Country).DataRange.EntireRow, _
pTable.PivotFields("Name").PivotItems(Measure).DataRange) Is Nothing Then
getnum2 = "No match"
ElseIf Intersect(pTable.PivotFields("Bank").PivotItems(Comp).DataRange.EntireRow, _
pTable.PivotFields("Date").PivotItems(Period).DataRange.EntireRow, _
pTable.PivotFields("Business Unit").PivotItems(BU).DataRange.EntireRow, _
pTable.PivotFields("Country").PivotItems(Country).DataRange.EntireRow, _
pTable.PivotFields("Name").PivotItems(Measure).DataRange).Count > 1 Then
getnum2 = "More than 1 match"
Else
getnum2 = Intersect(pTable.PivotFields("Bank").PivotItems(Comp).DataRange.EntireRow, _
pTable.PivotFields("Date").PivotItems(Period).DataRange.EntireRow, _
pTable.PivotFields("Business Unit").PivotItems(BU).DataRange.EntireRow, _
pTable.PivotFields("Country").PivotItems(Country).DataRange.EntireRow, _
pTable.PivotFields("Name").PivotItems(Measure).DataRange)
End If
End Function
Rather than calling the function three times, you could use a variable:
Function getnum2(ByVal Comp As String, Period As String, Measure As String, Optional BU As String, _
Optional Country As String, Optional Table As String, Optional TableSheet As String) As Double
Dim pTable As PivotTable, wTableSheet As Worksheet
Dim rgResult as Range
If BU = "" Then
BU = "Group"
End If
If Country = "" Then
Country = "Total"
End If
If TableSheet = "" Then
Set wTableSheet = Worksheets("Data")
Else
Set wTableSheet = Worksheets(TableSheet)
End If
If Table = "" Then
Set pTable = wTableSheet.PivotTables("PivotTable1")
Else
Set pTable = wTableSheet.PivotTables(Table)
End If
'Find match
Set rgResult = Intersect(pTable.PivotFields("Bank").PivotItems(Comp).DataRange.EntireRow, _
pTable.PivotFields("Date").PivotItems(Period).DataRange.EntireRow, _
pTable.PivotFields("Business Unit").PivotItems(BU).DataRange.EntireRow, _
pTable.PivotFields("Country").PivotItems(Country).DataRange.EntireRow, _
pTable.PivotFields("Name").PivotItems(Measure).DataRange)
if rgResult Is Nothing Then
getnum2 = "No match"
ElseIf rgResult.Count > 1 Then
getnum2 = "More than 1 match"
Else
getnum2 = rgResult.Value
End If
End Function
One very simple way to achieve this is by using two PivotTables.
In PivotTable 1, put all fields but the numeric one you want to
return in the ROWS area, and put the field that you want to return in
the VALUES area with aggregation set to COUNT.
In PivotTable 2, put all fields but the numeric one you want to
return in the ROWS area, and put the field that you want to return in
the VALUES area with aggregation set to SUM or MIN or MAX (It doesn't
matter which).
Then you can use a paramatized GETPIVOTDATA function to check PivotTable 1 to see if the thing you're looking up is unique (i.e. COUNT = 1) and if so, then look up the SUM/MIN/MAX of that item in PivotTable2. Given the item is unique, then the SUM/MIN/MAX is only operating on one number, and so does nothing to it.
Here's how that looks, using simplified data:
I've added conditional formatting to the two Pivots to highlight multiple occurances where we want to return the text 'Multiple Items', and as you can see, the formula that is populating column 6 of the Lookup table is only returning unique items as per your requirements.
Here's the formula, using Table notation as my Lookup range has been turned into an Excel Table:
=IF(GETPIVOTDATA("6",$A$3,"1",[#1],"2",[#2],"3",[#3],"4",[#4],"5",[#5])=1,GETPIVOTDATA("6",$H$3,"1",[#1],"2",[#2],"3",[#3],"4",[#4],"5",[#5]),"Multiple Items")
If I randomise the input cells in the Lookup table, you can see what happens when some items aren't in the PivotTable:
This approach works because the field you want to return is a numeric one, meaning you can add it to the VALUES pane of the Pivot. But you could still use this to return strings, by adding a unique ID to the source data, such as the row number, and putting that in the VALUES field, then retrieving it with the double GETPIVOTDATA lookup and using it to retrieve the associated string in the source data.
Another approach is to simply concatenate your columns into a primary key using a suitable delimiter such as the pipe character | and then use that as your lookup key. If you did a binary search on sorted data, this would be lightning fast. (I discuss this at http://dailydoseofexcel.com/archives/2015/04/23/how-much-faster-is-the-double-vlookup-trick/ ). The down side is that you wouldn't be warned in the event that there were multiple items. But it would be possible to do a second lookup using the match position returned by the first, to see if you get another result, and if so then return "Multiple Items". This would be super-fast.
Here's the fastest way to do this: using Binary Match on a sorted lookup table.
On the left I have 5 columns x 1048575 rows of random numbers between 1 and 10. These have been concatenated in column G to make a non-unique key, and then sorted ascending on that key.
(Because the concatenated key is text, it gets sorted alphabetical from left to right, which is why 1|1|1|1|10 appears between 1|1|1|1|1 and 1|1|1|1|2)
I gave the data in Column G the named range of Concat to simplify the formula. My lookup formula in J2 returns the row number of the lookup item if and only if that item is unique to the dataset. The formula is:
=IF(OR( AND(INDEX(Concat,MATCH(I2,Concat,1))=I2, MATCH(I2,Concat,1)=1), AND(INDEX(Concat,MATCH(I2,Concat,1))=I2,INDEX(Concat,MATCH(I2,Concat,1)-1)<>I2)),MATCH(I2,Concat,1),NA())
This executes in 0.01 milliseconds for one instance, for a lookup table of 1048576 rows. My double GETPIVOTDATA approach above took 6 milliseconds. So there you have it: a complex formula that gives a 600 times efficiency boost.
I can explain the formula later in need, but note that some of the complexity is due to the edge case where you may have a unique item appearing in row 1. If I leave out that edge case, then the formula is as follows:
=IF( AND(INDEX(Concat,MATCH(I3,Concat,1))=I3,INDEX(Concat,MATCH(I3,Concat,1)-1)<>I3),MATCH(I3,Concat,1),NA())

Excel VBA Function Lookover - Cooccurrences of two values

I've been working on a user-defined function in VBA to find a certain count. For background, "raw" is a worksheet that refers a sheet that has genres in column B and an artist ID in column C. The sheet that I'm working in has combinations of two genres: first in column A, second in column B.
Anyway, the function that I'm trying to build should do the following:
Take two inputs as strings. Check the genre column in "raw" for matches with the first input. Then, take that ID and find a cell that matches the ID and the second input. If there is one found, add one to a running count. Whether or not it is found, move onto the next match. The function will return an integer that indicates the number of times the two genres had the same artist ID.
Now, my function is returning #VALUE.. no syntax errors, no compiling errors. Just the error in returning the value. I've looked it over, googled like crazy, and I just can't figure it out. I'm new to VBA, so maybe I'm just missing something really obvious or I've defined something wrong. Either way, I just need another set of eyes to look over it. Any suggestions for improvement are much appreciated, so thank in advance for your time and help!!
Here is the code. I know it isn't the prettiest, but it's short and the logic should make sense.
Public Function cocount(c1 As String, c2 As String) As Integer
Dim rng As Range
Dim rng2 As Range
Dim cell As Range
Dim cell1 As Range
Dim ID As Integer
Dim Count As Integer
rng = Worksheets("Raw").Range("B2:B183579")
rng2 = Worksheets("Raw").Range("C2:C183579")
Count = 0
For Each cell In rng
If cell.Value = c1 Then
ID = cell.Offset(0, 1).Value
For Each cell1 In rng2
If cell1.Value = ID And cell1.Offset(0, -1).Value = c2 Then
Count = Count + 1
End If
Next cell1
End If
Next cell
cocount = Count
End Function
EDIT: Thanks for viewing my question and being willing to help (And thanks Rdster for attempting a solution). I uploaded pictures of the raw data and the combination, although it won't let me embed the images. Raw Data Combination List
Anyway, I'll try and explain my problem again. In the combination list of genres, each row contains two genres. I want to find how many times those two genres share the same artist ID in the Raw Data sheet. There are 181,000+ combinations, and 183,000+ rows in the Raw Data. Thus, the function needs to be efficient--something that I'm not incredibly great at doing even in other languages.
This can be achieved using built-in several different Excel Worksheet functions.
Excel Formula using COUNTIFS
Define 2 dynamic named ranges that will resize themselves to fit the data. Gendre_2 is defined relative to Gendre_1 this ensures that the ranges are the same size.
Gendre_1 = OFFSET(Raw!$A$1,1,0,COUNTA(Raw!$A:$A)-1,1)
Gendre_2 = OFFSET(Raw!$A$1,1,1,COUNTA(Raw!$A:$A)-1,1)
Formula
=COUNTIFS(Gendre_1,A2,Gendre_2,B2)
Reference: ExcelJet - Excel COUNTIFS Function
COUNTIFS counts the number of cells in a range that match supplied criteria. Unlike the COUNTIF function, COUNTIFS can apply more than one set of criteria, with more than one range. Ranges and criteria are applied in pairs, and only the first pair is required. For each additional criteria, you must supply another range/criteria pairs. Up to 127 range/criteria pairs are allowed.
VBA
Public Function cocount(c1 As String, c2 As String) As Double
Dim rng As Range, rng2 As Range
With Worksheets("Raw")
Set rng = .Range("A2", .Range("A" & .Rows.Count).End(xlUp))
Set rng2 = rng.Offset(0, 1)
cocount = WorksheetFunction.CountIfs(rng, c1, rng2, c2)
End With
End Function
If I understand correctly, you are trying to count the number of times that c1 and c2 = B# and C# where # is the same row.
Public Function cocount(c1 As String, c2 As String) As Integer
Dim Count As Integer, iRow as Integer
Count = 0
For iRow = 2 to Sheets("Raw").Cells(Rows.Count, "B").End(xlUp).Row
If Cells(iRow, "B") = c1 And Cells(iRow,"C") = c2 Then
Count = Count + 1
End If
Next iRow
cocount = Count
End Function
Check the genre column in "raw" for
matches with the first input. Then, take that ID and find a cell that
matches the ID and the second input.
try this:
Public Function cocount(FirstKey$, SecondKey$, FirstRng As Range, SecondRng As Range) As Long
Dim FirstAccurance As Range, ID$
Set FirstAccurance = FirstRng.Find(FirstKey, , xlValues, xlWhole, xlByRows, xlNext, 0)
ID = Cells(FirstAccurance.Row, SecondRng.Column).Value2
cocount = WorksheetFunction.CountIfs(SecondRng, ID, FirstRng, SecondKey)
End Function
test:

Excel VBA column B to single cell( coma separated ) , based on column A

data in excel:
1 Mary
1 John
1 Sam
2 Alaina
2 Edward
Result I expected was:
1 Mary, John, Sam
2 Alaina, Edward
User allquixotic answered a very similar question. Source here
This is his VBA approach:
Option Explicit
Function allquixotic(param As Variant, search As Range, values As Range, Optional absolute As Boolean = False) As String
Dim sep As String, retval As String
Dim i As Integer, rownum As Integer
Dim look As Range, j As Range
sep = ", "
retval = ""
For i = 1 To search.Rows.Count
Set look = search.Cells(i, 1)
If absolute Then
rownum = look.Row
Else
rownum = i
End If
If look.Value = param Then
If absolute Then
Set j = values.Worksheet.Cells(rownum, values.Column)
Else
Set j = values.Cells(i, 1)
End If
retval = IIf(retval = "", retval & j.Value, retval & sep & j.Value)
End If
Next
allquixotic = retval
End Function
Use the worksheet function (feel free to rename it) by using a formula like
=allquixotic(A1,$A$1:$A$15,$B$1:$B$15,true)
Use the fill handle to put the formula in all the cells
The parameters are the following:
=allquixotic(look_cell, key_range, value_range, absolute)
look_cell: The first parameter, should be a single cell or a value literal. Valid input includes things like 3, $6.25, "Hello", etc. This is the value that you are trying to find in key_range.
key_range: This should be a range of cells (more than one cell); if absolute is true then you will get very strange results unless this is a contiguous range (all the values are in sequential rows).
value_range: This should be a range of cells (more than one cell); if absolute is true then you will get very strange results unless this is a contiguous range (all the values are in sequential rows).
absolute: If true, then we will use the absolute row number (relative to the number of rows in the entire spreadsheet) of each "found" row in the key_range to determine what row to extract a value from value_range on. If false, we will use relative numbers; for instance, if we find a match in the third row of key_range, then we will extract the value from the third row of value_range. Recommended value is FALSE, or you can omit it to default to that.
Note: This function does not support the case where the key and value ranges are in columns, but it should be fairly easy to adapt it to that.
Also, if you specify multiple columns in either the key_range or the value_range, only the leftmost column will be used.
Again, all credit goes to allquixotic.

Range ID's changing due to cell selection

32bit Excel 2013 / Win 7 64 bit
UDF asks user for two range inputs from the same table and a lookup value ie:
Public Function FindBfromA(A as Range,B as Range, IDValue as Integer)
For IDCheck = 1 to A.Count
IF A(IDCheck) = IDValue then
IDNum = IDCheck
Exit For
End if
Next IDCheck
FindBfromA = B(IDNum)
End Function
Formula is added into another column of the table, for example
=FindBfromA([A],[B],[#C])
'Where C is calculated via something
My issues is Ranges A & B become disjointed. Where A(IDCheck) and B(IDCheck) should belong to corresponding columns in the same table row, based on where my cursor is when calculating begins Range [B] will re-key
This then causes the formula to return the wrong value from the FindBfromA=B(IDNum) as A(IDNum){Row} <> B(IDNum){Row}
I couldn't reproduce the error in the workbook I created with the false data - in my company (private) workbook the function operates essentially the same way, but captures two 'B' values given two IDs and passes them to another function.
It's difficult to be sure without seeing your range selections, but the unreliable element of your code is the cell references. By using a single integer index, you are basically selecting the nth cell in the range rather than cell on row n. My suspicion is that range B is offset from range A by a number of rows. Let's say your two selections were A = "A1:A10" and B = "B2:B11" then A(3), for example, would be on row 3 but B(3) would be on row 4. The same would apply if Range A had more than one column.
To eliminate that risk, refer to the ranges by the row and column indexes, as in the code below. You'll note I've also change the data type of the IDValue to a variant as this prevent an error being thrown in your IDValue should ever be something like a String or Long. I've also looped through range A with a For Each loop on each cell to cater for the case that range A has more than one column.
Public Function FindBfromA(A As Range, B As Range, IDValue As Variant) As Variant
Dim cell As Range
For Each cell In A.Cells
If cell.Value2 = IDValue Then
FindBfromA = B.Cells(cell.Row, 1).Value2
Exit Function
End If
Next
End Function

vb excel drag formula for variable number of rows

I have a excel sheet which I am populating using a VB program. The output sheet can have variable number of rows but has 6 columns (A:F). Now I want the column G to have hex2dec of all the rows in column A. Here's an example: Say column A has 400 rows (A1:A400) then I want G1:G400 to have values HEX2DEC(A1:A400). But this is just an example the rows can vary. I have this code so far:
Sub DataMod()
Dim i As Long, R3 As Long
R3 = 1
For i = 1 To sheet.UsedRange.Rows.Count
sheet.Cells(i, 7).Formula = "=HEX2DEC" & sheet.Cells(R3, 1)
R3 = R3 + 1
Next i
End Sub
But it's not working.
Review your HEX2DEC formula string
it doesn't include the necessary ()
the Cells() would return the value of the target cell, not its address (i.e. the result would be =HEX2DEC(1234) instead of =HEX2DEC(A1) - which may or may not be a problem
you could use variable i instead of R3, they both increment from the same starting point at the same increment
I recommend to use FormulaR1C1, you do not have variants there
Sub DataMod()
Dim C As Range
For Each C In ActiveSheet.UsedRange.Columns(1).Cells
C(1, 7).FormulaR1C1 = "=HEX2DEC(RC[-6])"
Next C
End Sub
The danger of UsedRange is that it might include any header rows, so you might want to get around this by selecting the input range manually before you fire your Sub() and work with the Selection object, e.g.
For Each C In Selection.Columns(1).Cells
Try This:
Sub DataMod()
' Get the number of rows used in Column A:
Dim NumRows as Long
NumRows = Range("A1").End(xlDown).Row
' Put the formulas in Column G all at once:
Range("G1:G" & NumRows).FormulaR1C1 = "=Hex2Dec(RC1)"
End Sub