Better Excel Formula for Complex Lookup - excel-2007
I am trying to improve a complex lookup procedure that I have inherited. The lookup was being generated through several UDFs combined with some standard worksheet functions. However, the issue was that when the user updated some data in the source sheet, the re-calc time was unacceptable.
So, I took a look and thought I may be able to write a better Excel formula only solution. Well, I did find a solution, but it is too much for Excel to handle on large data sets, and it crashes (understandably so!) when my VBA runs the formulas against the dataset.
Now, I could implement this fully in VBA, but then the user would have to press a button or something to update after every change. What I would like is a simpler approach, if there is one, using some of the advanced Excel 2007 formulas. Since I am not as well-versed in those formulas, I am reaching out for some help!
Okay, here is what I have to work with.
SourceSheet
Tids, Settlement Dates, and month-end prices (layer periods identified by 1, 2, 3, etc.) in columns like below
Tid SettleDate 1 2 3 4 5 6 7 8 9 10 ... n
FormulaSheet
Amongst other columns, I have the following columns
InitLayer LiqdLayer InstrClass Tid SettleDate InitPrice LiqdPrice Position
I also have the layer numbers in columns to the right of the entire data set, like this:
1 2 3 4 5 ... n
What I need to do is fill in the proper price changes in these columns based on some logic in the dataset by looking up the prices on the source sheet.
In pseudo-formula, this is what I need to happen for each layer column in the FormulaSheet:
If Layer < InitLayer OR Layer > LiqdLayer Then Return "-"
ElseIf Layer = InitLayer Then (Layered Price - InitPrice) * Position
    where Layered Price is obtained by finding the Intersect of the LayerNumber
    Column and Tid Row in the SourceSheet
ElseIf Layer = LiqdLayer Then (LiqdPrice - Previous Layered Price) * Position
    where Previous Layered Price is obtained by finding the Intersect of the Previous
    LayerNumber Column and Tid Row in the SourceSheet
Else (Layered Price - Previous Layered Price) * Position
    where Layered Price and Previous Layered Price are defined as above
End If
I did come up with this formula, which works well on small data sets, but it's too big and nasty for large data sets, or just too big and nasty, period!
=IF(OR(CH$3<$AT6,CH$3>$AU6),"-",IF($AT6=CH$3,(HLOOKUP(CH$3,layered_prices,RIGHT(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4),LEN(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4))-1)-1,FALSE)-$AV6)*$C6,IF($AU6=CH$3,($AW6-HLOOKUP(CG$3,layered_prices,RIGHT(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4),LEN(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4))-1)-1,FALSE))*$C6,(HLOOKUP(CH$3,layered_prices,RIGHT(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4),LEN(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4))-1)-1,FALSE)-HLOOKUP(CG$3,layered_prices,RIGHT(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4),LEN(ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4))-1)-1,FALSE))*$C6)))
Formula Key
CH = Layer Number
CG = Previous Layer Number
AT = InitLayer
AU = LiqdLayer
AX = InstrClass (used to find a separate lookup for Currencies)
T = Tid
G = SettleDate (used to find a separate lookup for Currencies)
AV = InitPrice
AW = LiqdPrice
C = Position
layered_prices = named range for the range of prices under the layer columns in SourceSheet
layered_tid = named range for tid rows in SourceSheet
layered_curtid = named range for currency tid rows in SourceSheet (just a separate lookup if InstrClass = "CUR"; the formula is the same)
Are there any other formulas, or combination of formulas that will allow me to get what I am seeking in a more efficient manner than the monstrosity I have created?
I agree with Kharoof's comment. You should break this formula into several columns. From my count, you need 4 more columns. The benefits are two-fold: 1) Your formula gets much shorter because you're not repeating the same function over and over and 2) You save calculation time because Excel will calculate it once instead of several times.
For instance, you call the exact same ADDRESS function four times. Excel doesn't "remember" what it was when evaluating a formula and so it calculates it anew each time. If you put it in its own cell, then Excel will evaluate that cell before any cells that depend on it and store the result as a value. When other cells reference it, Excel will provide the pre-evaluated result.
First, here's what your final formula should be: (The names in [brackets] indicate that a helper column fits there. It'll be some cell reference like CI$3 but I wasn't sure where you'd want to put it. You'll have to update those references based on where you add these columns.)
=IF(OR(CH$3<$AT6,CH$3>$AU6),"-",IF($AT6=CH$3,([LayerNumber]-$AV6)*$C6,IF($AU6=CH$3,($AW6-[PreviousLayerNumber])*$C6,([LayerNumber]-[PreviousLayerNumber])*$C6)))
And here are the four helper columns:
[ADDRESS] = ADDRESS(MATCH(IF($AX6="CUR",$T6 & " " & $G6,$T6),IF($AX6="CUR",layered_curtid,layered_tid),1),1,4)
[RIGHT] = RIGHT([ADDRESS],LEN([ADDRESS])-1)
[LayerNumber] = HLOOKUP(CH$3,layered_prices,[RIGHT]-1,FALSE)
[PreviousLayerNumber] = HLOOKUP(CG$3,layered_prices,[RIGHT]-1,FALSE)
By splitting it up, each step of the formula is easier to follow / debug as well as faster to process for Excel. If you want some quantitative improvement, the five formulas combined will be around 70% shorter than the single formula you have right now.
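As a rough sketch of the helper-column idea, the macro below writes one live helper formula per data row, so it still recalculates when the source data changes. Since ADDRESS(n,1,4) returns "A" followed by n, and RIGHT then strips the "A", the [ADDRESS] and [RIGHT] steps only recover the row number that MATCH already returned, so this sketch stores the MATCH result directly. The sheet name "FormulaSheet", the helper column CE, and the assumption that data starts in row 6 with Tids in column T are placeholders of mine, not details confirmed by the question.

```vba
Sub WriteTidRowHelper()
    ' Assumptions (hypothetical): sheet name "FormulaSheet", a free helper
    ' column CE, and data rows starting at row 6 with Tids in column T.
    Dim ws As Worksheet
    Dim lastRow As Long

    Set ws = ThisWorkbook.Worksheets("FormulaSheet")
    lastRow = ws.Cells(ws.Rows.Count, "T").End(xlUp).Row
    If lastRow < 6 Then Exit Sub

    ' One MATCH per row. Writing a relative A1 formula to a multi-cell
    ' range adjusts the row references for each row automatically.
    ws.Range("CE6:CE" & lastRow).Formula = _
        "=MATCH(IF($AX6=""CUR"",$T6&"" ""&$G6,$T6)," & _
        "IF($AX6=""CUR"",layered_curtid,layered_tid),1)"
End Sub
```

With that helper in place, each lookup in the layer columns could take the form HLOOKUP(CH$3,layered_prices,$CE6-1,FALSE), so the expensive MATCH runs once per row rather than once per layer cell.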
Related
Concatenating data descriptions and aligning to data range
I have been trying to learn Visual Basic to write code to perform tasks when working with data in Excel. I have been mostly copying snippets of code I find online and piecing them together.
Currently, I have folders containing tens of thousands of .csv files (data output from a CMM). In each of these files column A consistently contains labels for the data and column B consistently contains the CMM data. Currently, my program allows a user to select multiple .csv files and in a long roundabout way they all end up on one worksheet in Excel with the data labels in the first column and the data in the next columns. For example, if 10 CSV files are opened the data labels would be in the first column and the data would be in the next 10 columns.
The problem is that the data labels are not aligned with the data, and often each row of data has multiple labels. I have been able to concatenate the data labels into one label for each row of data but cannot figure out how to align this label with the row of data. At this point, I would be happy with a separate block of code that accomplishes this, but I suspect that my block of code that concatenates the labels could be easily modified to accomplish the task. I just haven't been able to figure it out.
So my code spits this out:
(Flatness) : Item (113) Plane:RH_5_Mating_Surface 5.012 4.014 6.313 etc... (Z) : Item (128) / (X) : Item (135) Circle:Offset_Dowel_Hole 1.012 2.987 5.478 etc... Circle:Cast_Hole_From_Offset_Dowel_Hole 2.147 7.895 4.258 etc...
Then this code concatenates the labels and spits them out in column B:
Dim rng1 As Range
Dim Lastrow As Long
Dim c As Range
Dim concat As String
Lastrow = .UsedRange.Rows(.UsedRange.Rows.Count).Row
Set rng1 = Range("A9:A" & Lastrow)
concat = ""
For Each c In rng1
    If c > 0 Then
        concat = concat & " " & c.Value
        concat = Trim(concat)
    Else
        c.Offset(-1, 1).Value = concat
        concat = ""
    End If
Next c
The result is:
(Flatness) : Item (113) Plane:RH_5_Mating_Surface 5.012 4.014 6.313 etc... (Z) : Item (128) / (X) : Item (135) Circle:Offset_Dowel_Hole 1.012 2.987 5.478 etc... Circle:Cast_Hole_From_Offset_Dowel_Hole 2.147 7.895 4.258 etc...
What I need is: I can't figure out how to show it here, but I need the rows to match up. Also note that here it shows the data and labels offset by the same amount, but in reality they are not. So my thinking is that I need it to search for the next row containing data and put the label next to it.
I feel like I can just change this part:
Else
    c.Offset(-1, 1).Value = concat
but I don't know how. I tried nesting another "For Each" here, similar to what it's already doing but with a "For Each d In rng2" where "rng2" was the data column, so that it would look for the next row with data and place "concat" next to the data using "d.Offset(-1, -1).Value = concat". I couldn't figure out how to get it to work.
This is one possible way of doing it: delete the empty cells in each column and shift the remaining cells up. If the data order is consistent, the labels and data should then line up correctly.
' Loop from the bottom up so that a deletion does not skip the cell that shifts into its place
For i = Lastrow To 1 Step -1
    If Range("A" & i).Value = "" Then
        Range("A" & i).Delete Shift:=xlUp
    End If
    If Range("B" & i).Value = "" Then
        Range("B" & i).Delete Shift:=xlUp
    End If
Next i
You can change the range according to your sheet. Also, modify the code to use another 'If' condition to check for blanks, if that works better for your case.
Variable number of terms in an Excel VBA Formula?
Is it possible to write a formula in VBA for Excel such that there are "n" terms in the formula, with the number of terms changing as the value of "n" does?
For instance, say you wanted to code cell a1 such that it was the sum of a2 and a3. Then you wanted b1 to be the sum of b2, b3, b4, b5 and so on, such that each row 1 cell for a range of columns is the sum of "n" cells below it, where "n" varies from column to column. Say that all cell addresses you wanted to use are known and stored in an array. Here is some code to better explain what I'm asking:
For i = 0 to n
    Range(arr1(i)).formula = "=" & range(arr2(i)).value & "-(" _
        & Range(arrk(i)).value & "+" & Range(arrk+1(i)).value & "+" _
        & Range(arrk+2(i)).value & "+" & ... & ")"
Next i
So what I'm looking for is one piece of VBA code that can make a cell formula contain a dynamic number of terms. The code above would make cell a1's value = a-(b+c+d+...) where the number of terms in the bracket is variable, depending on which cell the formula is applied to.
The image here shows an example of what I want to do. I'd like some code which could take "years income" and subtract a variable amount of "expenses" from it, where the number of expenses varies each year (but the number stays fixed for that year). The code needs to use a formula so that the expenses entries can be modified by the user.
Have you tried an array formula? An Excel array formula performs multiple calculations on one or more sets of values (the 'array arguments') and returns one or more results. Details: http://www.excelfunctions.net/Excel-Array-Formulas.html
Thanks for the suggestions everyone, I found a solution (not a particularly efficient one, but a solution nonetheless) to the conundrum today.
First I created an array which used the "pattern" of the junk cells to list every cell address which was to be included. Taking this array, I used a For loop to create a series of temporary arrays with the Application.Index command. For each temporary array, I used the Join command to turn the list of cells into a single string which I then inputted into a cell formula. Thanks to @thepiyush13, whose array formula approach inspired this. Here's some example code to show what I did:
' hypothetical array containing two sets of cells to use
Dim array1(0 To 1, 0 To 1) As Variant
Dim vartemp As Variant
Dim vartemptransposed As Variant
' col 1 will be used to add I10 and I13, col 2 I11 and I14
array1(0, 0) = "$I$10"
array1(1, 0) = "$I$13"
array1(0, 1) = "$I$11"
array1(1, 1) = "$I$14"
For i = 1 To 2
    ' Application.Index(arr, row#, col#) to create a new array
    vartemp = Application.Index(array1, 0, i)
    ' error if not transposed
    vartemptransposed = Application.Transpose(vartemp)
    randomstring = Join(vartemptransposed, ",")
    totalvalue = 100
    ' example formula: A1 = totalvalue - SUM(I10,I13), B1 = totalvalue - SUM(I11,I14)
    Cells(1, i).Formula = "=" & totalvalue & "-SUM(" & randomstring & ")"
Next i
I needed the code to run this many, many times on large lists which are generated dynamically but always hold the same pattern of where the "junk cells" are. Not included in the code, but I also used another array for the cell addresses of where to place the formula.
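The core of that approach, joining a list of addresses into one formula string, can be sketched on its own as below. This is only an illustration under my own assumptions: the function name BuildDifferenceFormula and the example addresses are hypothetical, and the address list is assumed to already be a one-dimensional array, which is the shape Join requires.

```vba
' Builds "=<total>-SUM(<addr1>,<addr2>,...)" for any number of terms.
' termAddresses must be a one-dimensional array of cell addresses.
Function BuildDifferenceFormula(ByVal totalAddress As String, _
                                ByVal termAddresses As Variant) As String
    BuildDifferenceFormula = "=" & totalAddress & _
                             "-SUM(" & Join(termAddresses, ",") & ")"
End Function

Sub DemoBuildDifferenceFormula()
    ' Hypothetical addresses: A1 = A2 - (A3 + A4 + A5)
    Range("A1").Formula = BuildDifferenceFormula("A2", Array("A3", "A4", "A5"))
    ' The string written to A1 is "=A2-SUM(A3,A4,A5)"
End Sub
```

Because the result is a real worksheet formula, the expense cells stay editable and the total updates when they change.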
Excel 2013 - Looking for a macro to write a sum for variable length sections
I'm trying to find a macro that will write a sum formula for a variable number of rows. I have a bunch of sections on a fairly extensive sheet. Each section has a column of costs, and each needs a sum at the end. However, the sections are variable in length, so I can't just write one sum and copy/paste it to each section. Sections are stacked like this:
Section Headers
1 Data $
2 Data $
3 Data $
Sum?
Section Headers
1 Data $
2 Data $
3 Data $
4 Data $
5 Data $
6 Data $
7 Data $
Sum?
I have a few hundred of these to do, and I'd rather not manually write the sum formula each time. The lower boundary of the range I need summed is always the same relative to the formula cell, but the upper boundary is a random number of rows higher. Is there a way to have a macro auto-detect how big the section is and make the sum only for that range?
This block of code assumes that each section heading starts with the word "Corrections" and that there are no blank rows between sections (so any blank cells will be filled with a sum formula). It also assumes the data is in column A. If your spreadsheet isn't set up like that, you could tweak this to work for your situation.
Note that there isn't much in terms of error checking here, so you'll have to make sure your spreadsheet doesn't have any issues, or add error checks to the code. For example, if your spreadsheet doesn't start with a section heading, you may have an issue, and if you have sections with no values (that is, a section header and then no data under it), you'll have an issue.
Dim wksh As Worksheet
Dim last_row As Long
Dim curr_start As Long
Set wksh = ThisWorkbook.Worksheets("<your worksheet name>")
last_row = wksh.Cells(wksh.Rows.Count, "A").End(xlUp).Row
For i = 1 To last_row + 1
    If wksh.Range("A" & i).Value = 0 Or wksh.Range("A" & i) = "" Then
        wksh.Range("A" & i).Formula = "=SUM(A" & curr_start & ":A" & i - 1 & ")"
    ElseIf Left(wksh.Range("A" & i).Value, 4) = "Corr" Then
        curr_start = i + 1
    End If
Next i
Using .formula to sum values in dynamic ranges
I created a macro that changes the layout of a raw Excel file and displays some useful information to the user. But I am stuck now. I want to display the sum of some cells in a column, but the cell where the sum is and the column itself are set dynamically. My layout looks like this:
Year1 Year2 Year3
Sum1 100
Sum2 50 48
Sum3 72 81
Sum4 26
------------------------------------------
Total
I want to display the sum of (Sum1, Sum2, Sum3, Sum4) in the Total row under each Year. The thing is, my years are set dynamically depending on the value of a parameter (i.e. I can have 2, 3 or 5 years).
I know I can do it using Range.WorksheetFunction.Sum, but the problem is that my values are subject to some changes after the macro has been used, and if I use this function the values won't change afterward. That's why I want to use Range.Formula to display these values, as I can enter the sum like I did for the values already in the sheet. But unlike these values (which are the sum of values in a unique column that is easy to select), I am not able to select the column dynamically.
Here is what I tried, to give you an idea of what I want to do if it is unclear:
For i = 0 To Duration - 1
    Sheet2.Range("D" & RNumber + 7 + Duration).Offset(0, 1 + 2 * i).Formula = "=SUM(Sheet2!" & Range("D" & RNumber + 6).Offset(0, 1 + 2 * i) & ":Sheet2!" & Range("D" & RNumber + 6 + Duration).Offset(0, 1 + 2 * i) & ")"
Next i
But it doesn't work, as the sum doesn't recognize the range. I know I should have something like "=SUM(Sheet2!E" & Firstrow & ":Sheet2!E" & Lastrow & ")", but then I won't be able to select my column dynamically.
It looks like your problem stems from the fact that you can't do things like "Sheet2!" & D+3+1 for columns the same way you can do "Sheet2!D" & 5+6+2 for rows. My opinion is that this is one of the many flaws created by the insane A1 reference style, and the best way to attack this problem at the root is to simply switch to R1C1 at the beginning of the macro.
With Application
    If .ReferenceStyle = xlA1 Then
        .ReferenceStyle = xlR1C1
    End If
End With
Some day, I plan to start a non-profit effort to rid the world of the misbegotten $A4:B$3 craziness altogether. But too many people are used to A1. And if you send your boss or co-worker a spreadsheet with numbers for columns, she might get upset or confused. So the other solution is to introduce some ADDRESS() elements into your SUMs. The ADDRESS function takes only numbers (well, a string for the sheet reference) and will return a cell reference in whichever style you want.
For i = 0 To Duration - 1
    Sheet2.Cells(RNumber + 7 + Duration, 5 + 2 * i).Formula = _
        "=SUM(INDIRECT(ADDRESS(" & RNumber + 6 & "," & 5 + (2 * i) & ",1,TRUE,""Sheet2""),1):" _
        & "INDIRECT(ADDRESS(" & RNumber + 6 + Duration & "," & 5 + (2 * i) & _
        ",1,TRUE,""Sheet2""),1))"
Next i
Or you can build the address in VBA with the Range.Address property (for example, Sheet2.Cells(row, col).Address) and avoid having the INDIRECT()s cluttering up your formulas.
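A minimal sketch of the VBA-side alternative, using numeric column indexes with Cells and letting Range.Address produce the A1 text for the formula. The procedure name WriteColumnTotals and its parameters are hypothetical; the row and column arithmetic in the usage line is copied from the loop in the question, and I have not checked it against the real sheet layout.

```vba
Sub WriteColumnTotals(ByVal ws As Worksheet, ByVal firstRow As Long, _
                      ByVal lastRow As Long, ByVal totalRow As Long, _
                      ByVal firstCol As Long, ByVal colStep As Long, _
                      ByVal colCount As Long)
    Dim i As Long
    Dim col As Long
    Dim sumRange As Range

    For i = 0 To colCount - 1
        ' Columns are plain numbers here, so they can be computed freely
        col = firstCol + colStep * i
        Set sumRange = ws.Range(ws.Cells(firstRow, col), ws.Cells(lastRow, col))
        ' Address(False, False) returns a relative A1 reference such as "E7:E10"
        ws.Cells(totalRow, col).Formula = "=SUM(" & sumRange.Address(False, False) & ")"
    Next i
End Sub
```

It could be called as WriteColumnTotals Sheet2, RNumber + 6, RNumber + 6 + Duration, RNumber + 7 + Duration, 5, 2, Duration. The totals remain live formulas, so they update when the user edits the values afterwards.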
VBA Macro: Trying to code "if two cells are the same, then nothing, else shift rows down"
My Goal: To get all data about the same subject from multiple reports (already in the same spreadsheet) in the same row.
Rambling Backstory: Every month I get a new datadump Excel spreadsheet with several reports of variable lengths side-by-side (across columns). Most of these reports have overlapping subjects, but not entirely. Fortunately, when they are talking about the same subject, it is noted by a number. This number tag is always the first column at the beginning of each report. However, because of the variable lengths of reports, the same subjects are not in the same rows. The columns with the numbers never shift (report1's numbers are always column A, report2's are always column G, etc.) and numbers are always in ascending order.
My Goal Solution: Since the columns with the ascending numbers do not change, I've been trying to write VBA code for a macro that compares (for example) the number of the active datarow from column A with column G. If the number is the same, do nothing; else move all the data in that row (and under it) from columns G:J down a line. Then move on to the next datarow.
I've tried: I've written several "For Each"s and a few loops with DataRow + 1, calling what I thought would make the comparisons, but they've all failed miserably. I can't tell if I'm just getting the syntax wrong or it's a faulty concept. Also, none of my searches have turned up this problem or even parts of it I can maraud and cobble together. Although that may be more of a reflection of my googling skill :)
Any and all help would be appreciated!
Note: In case it's important, the columns have headers. I've just been using DataRow = Found.Row + 1 to circumvent. Additionally, I'm very new at this and self-taught, so please feel free to explain in great detail.
I think I understand your objective and this should work. It doesn't use any of the methodology you were using, as on reading your explanation I had a good idea how to proceed. If it isn't what you are looking for, my apologies.
It starts at a predefined row (see the FIRST_ROW constant) and goes row by row comparing the two cells (MAIN_COLUMN & CHILD_COLUMN). If MAIN_COLUMN < CHILD_COLUMN it pushes everything between SHIFT_START & SHIFT_END down one row. It continues until it hits an empty row.
Sub AlignData()
    Const FIRST_ROW As Long = 2 ' So you can skip a header row, or multiple rows
    Const MAIN_COLUMN As Long = 1 ' this is your primary ID field
    Const CHILD_COLUMN As Long = 7 ' this is your alternate ID field (the one we want to push down)
    Const SHIFT_START As String = "G" ' the first column to push
    Const SHIFT_END As String = "O" ' the last column to push
    Dim row As Long
    row = FIRST_ROW
    Dim xs As Worksheet
    Set xs = ActiveSheet
    Dim im_done As Boolean
    im_done = False
    Do Until im_done
        If WorksheetFunction.CountA(xs.Rows(row)) = 0 Then
            im_done = True
        Else
            If xs.Cells(row, MAIN_COLUMN).Value < xs.Cells(row, CHILD_COLUMN).Value Then
                xs.Range(xs.Cells(row, SHIFT_START), xs.Cells(row, SHIFT_END)).Insert Shift:=xlDown
                Debug.Print "Pushed row: " & row & " down!"
            End If
            row = row + 1
        End If
    Loop
End Sub
I modified the code to work as a macro. You should be able to create it right from the macro dialog and run it from there also. Just paste the code right in and make sure the Sub and End Sub lines don't get duplicated. It no longer accepts a worksheet name but instead runs against the currently active worksheet.