I have a large, complex spreadsheet with ~200 tabs. It opens/loads very slowly (up to 5 minutes) because of the large number of formula calculations. I am trying to optimize the formulas so the spreadsheet will open/load faster.
One of the most frequent calculations is to multiply about 60 cells in each tab by a variable in 'Sheet1!B4' (Sheet 1, cell B4). I expect this value to change maybe once in a year or so, which would require updating all 200 tabs in at least 60 cells each.
Will it be better to hard-code the value, and take the hit for updating it once a year in all the affected cells in all the tabs?
Or is it ok to reference it in some way, which does not impact the performance, and preferably makes it faster?
Here are the three options I am considering:
Hardcoded value: =countif(C$10:C$30,$B60) * 10
Reference cell: =countif(C$10:C$30,$B60) * Sheet1!B4
Use Named Range of a single cell: =countif(C$10:C$30,$B60) * PARAMETER_VAL
where PARAMETER_VAL is a named range referencing Sheet1!B4
Which of the above would be the fastest?
Is there any other way to make it faster, that I may be missing?
I don't think any of the three alternatives will make a significant difference to the spreadsheet's performance, because the reference / named range points to a cell with a fixed value.
Related Q&A
Measurement of execution time of built-in functions for Spreadsheet
Related
I am working with a huge Excel file that is updated with a set of macros. The file also contains a large number of graphs to make output checks easy.
However, when I re-calculate the workbook it is extremely slow.
My question is: Do these graphs contribute to slowing down the calculation of the model? If so, is there a quick VBA way to only update graphs at the end of the overall calculation?
Without seeing your workbook this is hard to answer.
Most likely it is not the charts (is that what you call "graphs"?) that are slowing down the recalc, but inefficient formulas.
Check the chart data sources. If they point to worksheet cells, then all is good. If they point to named ranges / named formulas, then check what these formulas are.
Recalculation is affected by
volatile formulas like Today(), Now(), Indirect(), Offset() and a few others
inefficient formulas that needlessly repeat calculations that have already been performed, typically done in running totals
An example of this would be
=Sum(A$1:A2) copied down, like in this screenshot
In each row, the calculation starts in row 1 and goes down to the current row. This is a waste of effort.
A much more efficient formula is in column C, where just the value from the row above is added to the value of the current row.
=SUM(C1,A2)
These details can make a heck of a difference.
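To put rough numbers on that difference, here is a minimal sketch in Python (not Excel) that models the two running-total patterns and counts how many cell values each one reads. The function names are hypothetical, and the read counts are only a simplified model of the work involved, not a measurement of Excel's calculation engine.

```python
def running_totals_expanding(values):
    """Model of =SUM(A$1:A2) copied down: every row re-reads the whole range above it."""
    totals, reads = [], 0
    for k in range(1, len(values) + 1):
        window = values[:k]          # rows 1..k
        reads += len(window)         # k cells read for row k
        totals.append(sum(window))
    return totals, reads


def running_totals_incremental(values):
    """Model of =SUM(C1,A2) copied down: previous total plus the current value."""
    totals, reads = [], 0
    previous = 0
    for v in values:
        reads += 2                   # one read for the total above, one for the new value
        previous += v
        totals.append(previous)
    return totals, reads


values = list(range(1, 1001))
expanding_totals, expanding_reads = running_totals_expanding(values)
incremental_totals, incremental_reads = running_totals_incremental(values)
print(expanding_totals == incremental_totals, expanding_reads, incremental_reads)
```

Both functions return the same totals, but in this model the expanding-range version reads n(n+1)/2 cells for n rows while the incremental version reads 2n.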
For more information you may want to refer to Charles Williams' site http://www.decisionmodels.com/calcsecrets.htm and the pages linked from there.
It's a complex subject and can probably not be addressed in a simple answer to a seemingly simple question.
Could you please advise what would be the best way to create a union column for 12 separate columns (located in 12 different Excel sheets within a workbook) with or without VBA?
There are good manuals on how to do it for two columns without VBA (using the MATCH function); however, I am not sure how to approach the case with multiple columns.
I think this can be achieved with multiple consolidation ranges for a PivotTable. You would need labels for the columns and more than one column per sheet (you could clone the existing ones). It should sort and remove duplicates from the list automatically (if cloned).
EDIT:
I'll assume your IDs are all numeric (otherwise, sorting would be very tricky if not impossible without VBA). You could modify the following array formula to meet your needs (select an area with enough rows to hold the full stack of IDs, enter the formula, then commit it with Ctrl+Shift+Enter):
=SMALL(IFERROR(CHOOSE(COLUMN(INDIRECT("C1:C12",FALSE)),Sheet1!A1:A73,Sheet2!A1:A70,Sheet3!A1:A79,Sheet4!A1:A58,Sheet5!A1:A51,Sheet6!A1:A94,Sheet7!A1:A50,Sheet8!A1:A89,Sheet9!A1:A75,Sheet10!A1:A89,Sheet11!A1:A70,Sheet12!A1:A94),FALSE),ROW(INDIRECT("1:"&COUNT(Sheet1!A1:A73,Sheet2!A1:A70,Sheet3!A1:A79,Sheet4!A1:A58,Sheet5!A1:A51,Sheet6!A1:A94,Sheet7!A1:A50,Sheet8!A1:A89,Sheet9!A1:A75,Sheet10!A1:A89,Sheet11!A1:A70,Sheet12!A1:A94))))
I'll use a smaller version (2 columns) to explain how it works:
=SMALL(IFERROR(CHOOSE(COLUMN(A1:B1),A1:A73,C1:C70),FALSE),ROW(1:143))
First, COLUMN(A1:B1) returns a horizontal array of the integers 1 and 2. Passing this to the CHOOSE function with the two single-column ranges creates a single 73 x 2 array from both A1:A73 and C1:C70 (instead of creating a jagged array, the last three values of the second column will be filled in with #N/A).
Wrap the result with IFERROR to convert the three #N/A values to FALSE (otherwise, SMALL will return an error).
Next, ROW(1:143) returns a vertical array of integers between 1 and 143. Passing the 73 x 2 array and the array of integers between 1 and 143 to SMALL will return a single 143 x 1 array (vertical) of the sorted values (the three FALSE values are ignored).
Note on INDIRECT: Using INDIRECT in this way makes the formula stable even if rows/columns are deleted; however, it also makes the formula volatile, which will cause it to be recalculated every time there is a change in the workbook, which could slow things down considerably. Another option is INDEX (e.g., ROW(A1:INDEX(A:A,COUNT(...)))), which can be affected by row/column deletions, but isn't volatile.
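For readers who find the array formula hard to follow, here is a rough Python model of the same steps: pad the columns to a common height (what CHOOSE plus IFERROR does), flatten, drop the FALSE padding (what SMALL ignores), and sort. The function name is hypothetical and this is only a sketch of the logic, not something to run inside Excel. Note that, like the formula as far as I can tell, it keeps duplicate IDs.

```python
def stacked_sorted(columns):
    """Stack several numeric columns into one sorted list, mimicking SMALL(IFERROR(CHOOSE(...)))."""
    height = max(len(col) for col in columns)
    # CHOOSE pads the shorter columns with #N/A; IFERROR turns those into FALSE.
    padded = [list(col) + [False] * (height - len(col)) for col in columns]
    flat = [cell for col in padded for cell in col]
    # SMALL skips logical values in an array, so only real numbers survive.
    numbers = [x for x in flat
               if isinstance(x, (int, float)) and not isinstance(x, bool)]
    return sorted(numbers)


print(stacked_sorted([[3, 1, 2], [5, 4]]))
```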
If you don't mind a bit of manual effort, this works for numeric and non-numeric IDs:
Stack columns on top of each other manually using Ctrl-C + Ctrl-V
Go to Data tab --> Filter --> Advanced Filter --> tick unique records only --> choose your copy to location
This simple two-step process gives you the unique union of the stacked columns. Obviously, the more columns you have, the more useful a VBA approach becomes.
I have an Excel sheet in which we may keep adding or deleting rows.
I also have an average value in some cell. I would like the Excel formula to check whether there is text in another column and average only the corresponding rows.
Right now, if I insert another row, I have to update the average formula manually.
Is there a way to have a formula which checks whether column A is not empty and, if so, includes the data in column G in the average?
There are a lot of approaches to this. My current favourite is a CELL:INDEX(...) expression. For instance, to reference the first continuously populated range between B1 and B500, I would use (probably as a named range) $B$1:INDEX($B$1:$B$500,MATCH(TRUE, $B$1:$B$500="", 0)-1).
This approach is great because it's non-volatile, so it shouldn't bog your worksheet down. It might be vulnerable to $B$500 gradually shrinking if you're only ever deleting rows, though. Alternatives are referencing the whole column ($C:$C), but that's usually dog slow in modern Excel, or using OFFSET, which never shrinks but is volatile.
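As a rough illustration of what the MATCH(TRUE, range="", 0)-1 part does, here is a small Python model: find the first blank cell and keep everything before it. The function name is hypothetical. One assumption differs from Excel: if there is no blank cell at all, this sketch returns the whole list, whereas MATCH would return #N/A.

```python
def first_contiguous_block(cells):
    """Return the cells before the first blank, like $B$1:INDEX(range, MATCH(TRUE, range="", 0)-1)."""
    for position, cell in enumerate(cells):
        if cell == "" or cell is None:
            # position is the 0-based index of the first blank,
            # which equals the count of populated cells before it.
            return cells[:position]
    # Assumption: with no blank found, treat the whole list as populated.
    return cells


block = first_contiguous_block([4, 8, 6, "", 10])
print(block, sum(block) / len(block))
```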
I have two workbooks with identical sheets and I need to test whether the data they are getting (from different sources) is identical or within a certain threshold. This, I am already able to do fine. I create a third workbook which calculates the difference between the two.
However, the issue is that one workbook updates seconds before the second which means that if a cell gets two quick updates my calculations would lag behind.
So what I was thinking is that I make a note of the cell value in workbook 1 (the faster updating workbook) and if at anytime up to x seconds after workbook 2 cell has the same value as noted, they are good.
...but how would I go about this, is VBA even the best tool for this?
Any ideas?
Using timestamps and VBA, or maybe even conditional formatting and some good formulas, could solve this for sure.
However, without anything to work on, it is quite a big step to present you a solution.
Basically you would just create timestamps (internally as variables, or during the import process in cells) and then compare your values, after your threshold ran out. If the values of your compared cells still match, then it's ok.
However, this all depends on how you implemented your data input and comparison.
My basic feeling after I read your question was: "why wouldn't you just decrease the update rate of your calculation?"
Another idea: just use an indicator cell, switching between 1 and 0 or so, to indicate whether an update has already happened. Then, when you compare, you compare value + indicator, which is basically using timestamps without time.
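The timestamp idea can be sketched outside Excel like this. It is only a Python model of the comparison logic under some assumptions: each workbook's updates are available as (timestamp in seconds, value) pairs, and the function and parameter names are hypothetical. In VBA you would record the timestamps yourself when each cell changes.

```python
def values_agree(fast_updates, slow_updates, max_lag):
    """Check that every value seen in the faster workbook shows up in the
    slower workbook no more than max_lag seconds later.

    Both arguments are lists of (timestamp_seconds, value) pairs.
    """
    for t_fast, value in fast_updates:
        matched = any(
            v == value and 0 <= t_slow - t_fast <= max_lag
            for t_slow, v in slow_updates
        )
        if not matched:
            return False
    return True


fast = [(0.0, 10), (1.0, 11)]   # workbook 1 (updates first)
slow = [(2.0, 10), (3.0, 11)]   # workbook 2 (lags by two seconds)
print(values_agree(fast, slow, max_lag=3))
print(values_agree(fast, slow, max_lag=1))
```

With a 3-second window the two series agree; with a 1-second window they do not, because workbook 2 lags by two seconds.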
Excel 2007 is driving me nuts. I have several charts set up that I refresh periodically against some data in a SQL database. However, every time I refresh them, Excel bumps any absolute ranges I have for the charts to view this data by the amount of new records returned from the query.
The data is time related (I have a record for each minute of the day), and I want the chart (line chart) to be fixed in size... instead, Excel wants it to dynamically change, and thus it adds the extra rows to my named ranges.
Is there a way to prevent Excel from updating absolute named ranges when new data from a data source is returned? I hope this makes sense.
How are you defining your named ranges? I'm assuming you are just using cell references; if so, it may be worth looking at the OFFSET function, particularly if you are dealing with fixed-size ranges (e.g. minutes in a day).
For example, if you had column headers in the first row, your minutes were in column A and you had exactly 1 row for each minute your offset formula would look like:
OFFSET(A2,0,0,1440,1)
and is defined as
OFFSET(StartCell, OffsetRows, OffsetCols, Height, Width)
Because any of the values within the arguments can also be formulas, OFFSET is a very useful function in defining ranges.