Grabbing Top 5 Events From Multiple Columns in Excel - vba

My file has four sets of columns I want to grab the top 5 out of. There are four columns with names and four columns with total dollar amounts in them. I assume the issue is that the MATCH() formula returns a row number, but each row can hold up to four dollar amounts and names, which produces an #N/A error when I try it. The formula I am trying is:
=INDEX(Franchise,MATCH(I37,Totals,0))
The Franchise being the four columns of names and the Totals being the four columns of totals.
At this point I am stumped.
How would I go about creating that formula?
You can see where I want the formula and results to post at the top.
Here is the file.

You need to get the top 5 Totals and use those to retrieve the Qx (or other) label. The LARGE function does not like discontiguous cell ranges but the newer AGGREGATE¹ function has a LARGE subfunction (14) and you can force it to ignore errors with option 6. Forcing anything that isn't in column D, H, L or P into a #DIV/0! error will discard them from any calculation.
In M2 use this standard formula:
=AGGREGATE(14, 6, $D$10:$P$35/NOT(MOD(COLUMN($D:$P), 4)), ROW(1:1))
In K2 use this standard formula to retrieve the Qx label:
=IFERROR(INDEX($C$10:$C$35, MATCH(M2, $D$10:$D$35, 0)),
IFERROR(INDEX($G$10:$G$35, MATCH(M2, $H$10:$H$35, 0)),
IFERROR(INDEX($K$10:$K$35, MATCH(M2, $L$10:$L$35, 0)),
INDEX($O$10:$O$35, MATCH(M2, $P$10:$P$35, 0)))))
Fill K2:M2 down to K6:M6. Your results should resemble the following.
        
Caveat - If there are ties in the top 5 amounts, a more complicated formula would have to be devised to account for multiple events with identical totals.
¹ The AGGREGATE function was introduced with Excel 2010. It is not available in earlier versions.
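For readers who want to check the worksheet result outside Excel, here is a minimal Python sketch of the same idea: flatten every (label, total) pair and keep the five largest totals. The sample rows, the labels and the name top_n are hypothetical and not taken from the asker's file.

```python
# Hypothetical sample data: each row holds up to four (label, total) pairs,
# mirroring the four name/total column pairs (C:D, G:H, K:L, O:P).
rows = [
    [("Q1 Gala", 500.0), ("Q2 Fair", 120.0), ("Q3 Run", 75.0), ("Q4 Ball", 910.0)],
    [("Q1 Raffle", 300.0), ("Q2 Dinner", 640.0), None, ("Q4 Auction", 1200.0)],
    [("Q1 Walk", 50.0), None, ("Q3 Concert", 830.0), None],
]


def top_n(rows, n=5):
    """Flatten every (label, total) pair and return the n largest by total."""
    pairs = [pair for row in rows for pair in row if pair is not None]
    # sorted() is stable, so tied totals keep their original order.
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:n]


top5 = top_n(rows)
for label, total in top5:
    print(f"{label}: {total:.2f}")
```

Unlike the MATCH-based lookup, this keeps each label attached to its total, so ties do not return the wrong label.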

Related

Extracting "hidden" data from expanding/collapsing pivot table - Excel

I'm not sure if this is possible but as you can see I have a pivot table with multiple dependent and expandable fields. I am trying to concatenate the data from columns A:D into one cell which works fine in row 2 but doesn't work with blank parent cells, as you can see in column F.
Any ideas for how to achieve this?
Pivot table
This answer assumes that you don't want to just Repeat All Item Labels in the PivotTable from the "Report Layout" drop-down on the PivotTable Tools "Design" tab.
A formula to get the first non-blank value on or above the same row as the current cell from Column B can be constructed with a combination of AGGREGATE, SUMPRODUCT and OFFSET, like so:
=OFFSET($B2,SUMPRODUCT(AGGREGATE(14,6,ROW($B$1:$B$100)*--(ROW($B$1:$B$100)<=ROW())*--(LEN($B$1:$B$100)>0),1))-ROW(),0)
How does it work?
Starting with the outermost part, OFFSET($B2, VALUE, 0) - this will start in cell B2, then look up or down by VALUE rows to get the value.
Next we need to know how many rows we will need to look up-or-down. Now, if we can work out the bottom-most row with data, we can subtract the current ROW() from that, giving us OFFSET($B2, NON_BLANK-ROW(),0)
So, to finish up we need to work out which rows are not blank, AND which rows are on-or-above our current row, then take the largest of those. This is going to take an ArrayFormula, but we can use SUMPRODUCT to make that calculate properly. To find the largest number we could use MAX or LARGE - but we get fewer errors if we pick AGGREGATE(14,6,..,1). (The 14 means "we want the kth largest number", the 6 means "ignore error values", and the 1 is k - so "we want the largest number, ignoring errors")
But, what list of numbers are we going to look at, I don't hear you ask. Well, we want the ROW for output from our range (I'm using $B$1:$B$100, because using the whole column B would take far too long to calculate repeatedly), a comparison against the current ROW(), and a check that the LENgth is > 0. Those last two are comparisons, so let's write them out first:
ROW($B$1:$B$100)<=ROW()
and
LEN($B$1:$B$100)>0
We want to use -- to convert TRUE and FALSE to 1 and 0 - this means that any "bad" values become 0, and any "good" values are larger than 0:
ROW($B$1:$B$100)*--(ROW($B$1:$B$100)<=ROW())*--(LEN($B$1:$B$100)>0)
This gives us the Row number when the Row is on-or-before the current row AND Column B is not blank - if either of those is FALSE, then we get 0 instead. Stick that in the AGGREGATE to find the largest number:
AGGREGATE(14, 6, ROW($B$1:$B$100)*--(ROW($B$1:$B$100)<=ROW())*--(LEN($B$1:$B$100)>0), 1)
Then put it in a SUMPRODUCT to force Excel to treat it as an ArrayFormula, and that's your NON_BLANK. This then gives you that first formula right at the top of the post.
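To make the row arithmetic concrete, here is a small Python sketch of the same logic: multiply each row number by the two conditions, take the maximum, and read the value from that row. The function name and the sample column are hypothetical; this models the formula and is not a replacement for it.

```python
def last_non_blank_at_or_above(column, row):
    """Return the value of the last non-blank cell at or above `row` (1-based)."""
    # Row number where the cell qualifies, 0 otherwise: the same product the
    # formula builds with ROW(...) * --(ROW(...)<=ROW()) * --(LEN(...)>0).
    candidates = [
        r if (r <= row and len(column[r - 1]) > 0) else 0
        for r in range(1, len(column) + 1)
    ]
    best = max(candidates)  # plays the role of AGGREGATE(14, 6, ..., 1)
    return column[best - 1] if best else ""


# Hypothetical column B of a pivot table with blank child rows.
column_b = ["Region", "North", "", "", "South", ""]
filled = [
    last_non_blank_at_or_above(column_b, r)
    for r in range(1, len(column_b) + 1)
]
print(filled)
```

Each blank row picks up the nearest label above it, which is what the OFFSET formula returns when filled down.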

Merge values in multiple columns into one

I have the following data structure:
As you see in column J, I am trying to merge data into one column from columns A & C & E & G.
I am using this formula:
=IF(ROW()<=COUNTA($A:$A);INDEX($A:$C;ROW();COLUMN(A1));INDEX($A:$C;ROW()-COUNTA($A:$A)+1;COLUMN(C1)))
and I get the values in column K as you see. Currently this formula is merging only two columns. How to modify it to merge all four columns?
And how to only get those values starting from row 5?
The column height will vary constantly: sometimes there are 10 values in column A and sometimes there are 2 values.
Either any excel formula or any VBA code will be acceptable.
There is a fairly standard method for retrieving unique values from a column but not multiple columns. To achieve the retrieval from multiple columns you need to stack multiple formulas together with the processing being passed to successive columns once the earlier formula errors out.
      
The array formula¹ in J5 is,
=IFERROR(INDEX($A$5:$A$99, MATCH(0, IF(LEN($A$5:$A$99), COUNTIF(J$4:J4, $A$5:$A$99), 1), 0)),
IFERROR(INDEX($C$5:$C$99, MATCH(0, IF(LEN($C$5:$C$99), COUNTIF(J$4:J4, $C$5:$C$99), 1), 0)),
IFERROR(INDEX($E$5:$E$99, MATCH(0, IF(LEN($E$5:$E$99), COUNTIF(J$4:J4, $E$5:$E$99), 1), 0)),
IFERROR(INDEX($G$5:$G$99, MATCH(0, IF(LEN($G$5:$G$99), COUNTIF(J$4:J4, $G$5:$G$99), 1), 0)),
""))))
I have only included columns A, C, E and G as your sample data shows only duplicates in columns B, D, F, and H.
¹ Array formulas need to be finalized with Ctrl+Shift+Enter↵. If entered correctly, Excel will wrap the formula in braces (i.e. { and }). You do not type the braces in yourself. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula. Try to reduce your full-column references to ranges more closely representing the extents of your actual data. Array formulas chew up calculation cycles quickly as the referenced ranges grow, so it is good practice to narrow the referenced ranges to a minimum. See Guidelines and examples of array formulas for more information.
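The nested IFERROR formula amounts to "walk the columns in order and keep each non-blank value the first time it appears". A rough Python sketch of that behaviour follows; the sample columns and the name stack_unique are hypothetical. One difference to keep in mind: COUNTIF compares text case-insensitively, whereas this sketch is case-sensitive.

```python
def stack_unique(*columns):
    """Stack columns top to bottom, skipping blanks and values already output."""
    seen = set()
    out = []
    for column in columns:
        for value in column:
            # Same test as IF(LEN(...), COUNTIF(J$4:J4, ...), 1) = 0:
            # the cell is non-blank and not yet in the output list.
            if value != "" and value not in seen:
                seen.add(value)
                out.append(value)
    return out


# Hypothetical stand-ins for columns A, C, E and G.
col_a = ["apple", "pear", ""]
col_c = ["pear", "plum"]
col_e = ["fig", "apple"]
col_g = ["kiwi"]

merged = stack_unique(col_a, col_c, col_e, col_g)
print(merged)
```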
This answer is another way of thinking about the formulas you could use for this sort of task. It gets to the point made by @Jeeped that it is difficult to find unique values in multiple columns. My first step then is to create a single column.
If you can live with a helper column, these formulas might be a tad easier to maintain than the nested IFERROR already proposed. They are equally difficult to understand though at first glance. The other upside is that it scales nicely if the number of columns involved increases.
It is possible using CHOOSE and some INDEX math to build a single column array of a group of separated columns. The trick is that CHOOSE will join discontinuous ranges side-by-side when given an array as the selecting parameter. If this starts with columns of the same size, you can then use division and mod math to turn it into a single column.
Picture of ranges shows the four groups of data with duplicates colored red.
Formula in F2:F31 is an array formula. This is combining all of the columns into an array and then back into a single column. I selected the columns out of order just to emphasize that it is handling a discontinuous range.
=INDEX(CHOOSE({1,2,3,4}, A2:A7,C2:C7,B2:B7,D2:D7), MOD(ROW(1:30)-1, ROWS(A2:A7))+1,INT((ROW(1:30)-1)/ROWS(A2:A7))+1)
The array formula in H2 and copied down is then the standard formula for unique values. The one exception is that instead of avoiding blanks like normal, I am avoiding 0 values.
=IFERROR(INDEX(F2:F31,MATCH(0,IF(F2:F31=0,1,COUNTIF($H$1:H1,F2:F31)),0)),"")
A couple of other comments about this approach:
In the CHOOSE, I am using {1,2,3,4}. This could be replaced with TRANSPOSE(ROW(1:4)) or whatever number of columns you have.
There is also a ROWS(A2:A7) in 2 places, this could just be 2:7 or 1:6 or whatever size was used for the column size. I used one of the data ranges so that the coloring was simplified and to emphasize it needs to match the size of the block.
And the ROW(1:30) is used for the number of total items to collect. It really only needs to be 1:24 since there are 6*4 items, but I made it big while testing.
There are definitely a couple of downsides to this approach, but it may be a good trick to keep in the toolbox. Never know when you might want to make a column out of discontinuous ranges. The largest downside is that the columns of data all need to be the same size (and of course the helper column).
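The division and MOD math used in the INDEX call can be checked with a short Python sketch. It maps a running item number k to a row and column exactly as MOD(k-1, rows)+1 and INT((k-1)/rows)+1 do. The function name and the sample data are hypothetical, and the sketch assumes all columns are the same size, as the formula does.

```python
def flatten_columns(columns, total):
    """Turn equal-sized columns into one list using mod/int index math."""
    n_rows = len(columns[0])
    out = []
    for k in range(1, total + 1):
        row = (k - 1) % n_rows + 1   # MOD(ROW(1:30)-1, ROWS(A2:A7)) + 1
        col = (k - 1) // n_rows + 1  # INT((ROW(1:30)-1) / ROWS(A2:A7)) + 1
        out.append(columns[col - 1][row - 1])
    return out


# Three hypothetical two-row columns, deliberately selected out of order.
columns = [["a1", "a2"], ["c1", "c2"], ["b1", "b2"]]
single = flatten_columns(columns, total=len(columns) * len(columns[0]))
print(single)
```

Asking for more items than rows times columns would run past the last column, which is where the worksheet formula starts returning errors that IFERROR then hides.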
This code will do what you ask:
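Sub MoveData()
    ' Stack the non-blank runs from every second column (A, C, E, G, ...)
    ' left of the output column into column J, starting at row 5.
    Const START_ROW As Long = 5
    Const START_COL As Long = 1     ' column A
    Const STEP_COL As Long = 2      ' skip every other column
    Const OUTPUT_ROW As Long = 5
    Const OUTPUT_COL As Long = 10   ' column J

    Dim srcRow As Long, srcCol As Long, outRow As Long

    outRow = OUTPUT_ROW
    srcCol = START_COL
    Do While srcCol < OUTPUT_COL
        srcRow = START_ROW
        ' Copy down the column until the first blank cell.
        Do While Cells(srcRow, srcCol).Value <> ""
            Cells(outRow, OUTPUT_COL).Value = Cells(srcRow, srcCol).Value
            outRow = outRow + 1
            srcRow = srcRow + 1
        Loop
        srcCol = srcCol + STEP_COL
    Loop
End Sub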
I think you guys are making this complicated. Just pull the range of data into Power Query, select all the columns and unpivot them. This will bring all the data into a single column.

AVG of Cells Next to Cells used in Another Formula

I am new to asking questions here so I hope I get this correct. I am helping my dad with a spreadsheet and I'm having issues with figuring out how to do one formula. I don't know if it can be done with a formula or if it has to be done with macros.
This is a scoring sheet with multiple matches. For each match there is a total score and the cell next to the score is an X count (number of bullseyes). In the same row (column K) I calculate the top 6 total scores and average them:
=AVERAGE(LARGE((N15,Q15,T15,W15,Z15,AC15,AF15,AI15,AL15,AO15,AR15,AU15,AX15,BA15,BD15,BG15,BJ15),{1,2,3,4,5,6}))
Now I need to take the AVG of the X counts that are next to the total scores used in the formula above and put the result in column L.
For example, if the cells that are used for AVG score in that row are:
N15,Q15,T15,W15,Z15,AC15
then the cells that would need to be used for the X count AVG would be:
O15,R15,U15,X15,AA15,AD15
This result would be put into L15
Please help. If any clarification is needed just let me know.
Screen Shot:
Please try the following formula:
=SUMPRODUCT(O15:BM15,
--(MOD(COLUMN(N15:BL15)-COLUMN($N15),3)=0),
--(N15:BL15+O15:BM15/10^3+COLUMN(N15:BL15)/10^6>=
LARGE(N15:BL15+O15:BM15/10^3+COLUMN(N15:BL15)/10^6,6))
)/6
How does it work?
SUMPRODUCT has 3 parameters - first is the array to sum, next 2 parameters return an array of 0 and 1 to choose only interesting elements of the first array.
(MOD(COLUMN(N15:BL15)-COLUMN($N15),3)=0)
This part is included to avoid listing every single cell. Because the score is in every third column of the input range, we calculate each column number relative to the first column; MOD(column,3) returns {0,1,2,0,1,2,...}, so the comparison with 0 returns {1,0,0,1,0,0,...}. Only every third column of the input array is therefore included in the sum.
(N15:BL15+O15:BM15/10^3+COLUMN(N15:BL15)/10^6>=
LARGE(N15:BL15+O15:BM15/10^3+COLUMN(N15:BL15)/10^6,6))
This part is to decide which 6 of the scores should be included in the final sum. The trickiest part is to decide what to do with ties. My approach is to take:
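if two scores are the same, take the one with the higher number of bullseyes
if it is still tied, break the tie by column position (note that adding COLUMN(...)/10^6, as the formula does, favours the later column; subtract it instead to favour the earlier one)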
This means that instead of N15 value we calculate:
N15+O15/10^3+COLUMN(N15)/10^6
With your sample data it evaluates to 566.017014: the first three decimal places hold the number of bullseyes and the next three hold the column number.
You can use the same formula to calculate average of top 6 scores by changing the first parameter:
=SUMPRODUCT(N15:BL15,
--(MOD(COLUMN(N15:BL15)-COLUMN($N15),3)=0),
--(N15:BL15+O15:BM15/10^3+COLUMN(N15:BL15)/10^6>=
LARGE(N15:BL15+O15:BM15/10^3+COLUMN(N15:BL15)/10^6,6))
)/6
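The composite ranking can be sketched in Python with a tuple key, which orders matches the same way as score + X/10^3 + column/10^6 but without relying on decimal places. The match data, the column numbers and the function name below are hypothetical; the sketch assumes at least n matches exist.

```python
def top_scores_averages(matches, n=6):
    """matches: list of (column_number, score, x_count) tuples.

    Returns (average score, average X count, chosen columns) for the n
    best matches, ranked by score, then X count, then column number.
    """
    # Larger column number sorts first on a full tie, matching the
    # formula's "+ COLUMN(...)/10^6" term.
    ranked = sorted(matches, key=lambda m: (m[1], m[2], m[0]), reverse=True)[:n]
    avg_score = sum(m[1] for m in ranked) / n
    avg_x = sum(m[2] for m in ranked) / n
    return avg_score, avg_x, [m[0] for m in ranked]


# Hypothetical row: scores sit in every third column starting at N (14).
matches = [
    (14, 566, 17), (17, 570, 20), (20, 566, 15), (23, 580, 25),
    (26, 568, 10), (29, 566, 17), (32, 590, 30), (35, 575, 22),
]
avg_score, avg_x, chosen = top_scores_averages(matches)
print(avg_score, avg_x, chosen)
```

Here three matches share a score of 566 and only one fits in the top six, so the tie-breakers decide which X count enters the average.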
You can try this not so elegant solution:
=SUMPRODUCT(INDEX(N15:BK15,MATCH(LARGE((N15,Q15,T15,W15,Z15,AC15,AF15,AI15,AL15,AO15,AR15,AU15,AX15,BA15,BD15,BG15,BJ15),{1,2,3,4,5,6}),N15:BK15,0)+1))/6
Entered as an array formula with Ctrl+Shift+Enter in cells L15:M15 (2 cells), which should look like this:
{=SUMPRODUCT(INDEX(N15:BK15,MATCH(LARGE((N15,Q15,T15,W15,Z15,AC15,AF15,AI15,AL15,AO15,AR15,AU15,AX15,BA15,BD15,BG15,BJ15),{1,2,3,4,5,6}),N15:BK15,0)+1))/6}
with added braces.
The number 6 equals the number of top scores you want returned.
Now, why 2 cells (L15:M15)? I could not make SUMPRODUCT evaluate the resulting array from the INDEX, so we have to enter it in 2 cells. I don't think that would be a problem since, in your screenshot, column M is not used.
Note: If the range evaluated has fewer than 6 items, it will error out. Also, good point by user3964075: it may or may not be able to deal with ties.

Trailing Average Using AverageIf in Excel

I am trying to find the average for the last 3 instances only. I am using the AVERAGEIF function, which calculates the average for the entire range, but I need it to calculate only for the last 3 instances it finds (or fewer if fewer than 3 are available). I need the entire column for G and H to have the average for the last 3 games that the team played.
This is what I have:
=AVERAGEIF(B3:C17,B17,D3:E17)
You can do this with array formulas (They have to be entered using the keys Ctrl+Shift+Enter)...
Basic steps are:
Find the row (including and above current) that is the third highest row number containing the team name (or use row 1 otherwise)
Use the INDIRECT ranges in your AVERAGEIF from B-that_row to C-current_row and D-that_row to E-current_row
So in cell F17 you would have the formula
{=AVERAGEIF(INDIRECT("B"&LARGE(IF(--($B$3:B17=B17)+($C$3:C17=B17),ROW($B$3:B17),1),3)&":"&CELL("address",C17)),B17,INDIRECT("D"&LARGE(IF(--($B$3:B17=B17)+($C$3:C17=B17),ROW($B$3:B17),1),3)&":"&CELL("address",E17)))}
We repeat some of the logic, because we have two ranges (criteria range and average range).
IF(--($B$3:B17=B17)+($C$3:C17=B17),ROW($B$3:B17),1) means that if column B or (using +) column C has the value in B17, give me the row number, otherwise 1 (our <3 case... we could make this 3, the first row of team names)
LARGE(...,3) will give us the third highest of this array --> the third highest row number having our team name
INDIRECT("B"&...&":"&CELL("address",C17)) is going to give us the range using our third highest row number to the current row, columns B and C
then we do exactly the same thing as you were doing in AVERAGEIF but using this INDIRECT range and the equivalent for columns D and E
Fun question! Good luck. And remember to use Ctrl+Shift+Enter to enter it!
EDIT The above was giving an #NUM! error for the first two rows - that was because the LARGE function was trying to get the third largest in an array of 2! Also noticed that there were some cases where the column letter needed to be absolute (i.e. $) for copying to the Away column. So the updated formula:
{=AVERAGEIF(INDIRECT("B"&LARGE(IF(--($B$3:$B17=B17)+($C$3:$C17=B17),ROW($B$3:$B17),1),MIN(3,ROW()-2))&":"&CELL("address",$C17)),B17,INDIRECT("D"&LARGE(IF(--($B$3:$B17=B17)+($C$3:$C17=B17),ROW($B$3:$B17),1),MIN(3,ROW()-2))&":"&CELL("address",$E17)))}
Replaced the 3 with MIN(3,ROW()-2) so that we get 3 when at least three data rows are available, but 1 or 2 if we are in one of the first two data rows.
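As a cross-check on the formula, the intended result can be sketched in Python. The layout is an assumption based on the ranges in the question (home team in B, away team in C, home stat in D, away stat in E); the team names, numbers and the function name are hypothetical.

```python
def trailing_average(games, team, upto, window=3):
    """Average the team's stat over its last `window` games up to index `upto`.

    games: list of (home, away, home_stat, away_stat) tuples in date order.
    upto:  0-based index of the current game (inclusive).
    """
    stats = []
    for home, away, home_stat, away_stat in games[: upto + 1]:
        if home == team:
            stats.append(home_stat)
        elif away == team:
            stats.append(away_stat)
    recent = stats[-window:]  # fewer than `window` entries is fine
    return sum(recent) / len(recent) if recent else None


# Hypothetical schedule.
games = [
    ("Lions", "Bears", 10, 7),
    ("Bears", "Hawks", 14, 3),
    ("Hawks", "Lions", 21, 17),
    ("Lions", "Hawks", 24, 20),
    ("Bears", "Lions", 6, 30),
]
print(trailing_average(games, "Lions", upto=4))
```

Slicing the last three entries handles the "fewer than 3 games" case without the MIN(3, ...) adjustment the worksheet formula needs.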
OK, I posted this prematurely and attempted to delete it when I realised it wouldn't work. It should work now, provided you add another condition, which is the game dates in column A. Remember that this is an array formula, so hit Ctrl+Shift+Enter. Dates in column A; teams in column B; stats in column D. This formula can reside somewhere permanent on the sheet so you can enter the team name (shown as F13 here) to get the three most recent stats.
=AVERAGE(VLOOKUP(LARGE(IF(B3:B24=F13,A3:A24),1),A3:D24,4),VLOOKUP(LARGE(IF(B3:B24=F13,A3:A24),2),A3:D24,4),VLOOKUP(LARGE(IF(B3:B24=F13,A3:A24),3),A3:D24,4))

In excel, want to only sum certain values(not as easy as SUMIF)?

So I have two columns: one named program and one with cost values. The programs are ABC, A, B, and C. I want to sum the costs of all programs that contain A, all that contain B, and all that contain C. ABC is clearly included in all the sums. The problem is that, to show just these programs, the spreadsheet has a filter on it, which messes SUMIF up. Can someone help? Here is an example of what I mean:
program cost
A 5.00
B 4.00
ABC 9.00
A 2.00
so I would want in three separate cells "sum with A"=16.00, "sum with B"=13.00, "sum with C"=9.00.
Item | Total
A | 16
B | 13
C | 9
Assuming your above range is in A1:B5, my first formula is the following Array formula:
{=SUM(IF(ISERROR(FIND(B6,$A$1:$A$5)),0,$B$1:$B$5))}
You create an Array formula by entering the formula and holding down the Ctrl+Shift keys while you hit Enter. In my solution, I've created an area where I calculate my totals and have a column (called Item in this case) which indicates the letter I look for in the original A column.
If you were trying to enter this using VBA, you would use the FormulaArray property:
Selection.FormulaArray = "=SUM(IF(ISERROR(FIND(B6,$A$1:$A$5)),0,$B$1:$B$5))"
Update
Restricting the calculation to only visible cells is a bit more complicated. Suppose we have your original data in cells A1:B5. Let's also suppose our test values start in cell C7 (diagonal to the source data). Our totals formula would look like:
=SUMPRODUCT(SUBTOTAL(3,OFFSET($B$1:$B$5,ROW($B$1:$B$5)-ROW($B$1),0,1)), --NOT(ISERROR(FIND(C7,$A$1:$A$5))), $B$1:$B$5)
The following portion returns an array of single-cell ranges, one for each row of B1:B5:
OFFSET($B$1:$B$5,ROW($B$1:$B$5)-ROW($B$1),0,1)
This portion returns 1 for each visible cell and 0 for each cell hidden by the filter:
SUBTOTAL(3,OFFSET($B$1:$B$5,ROW($B$1:$B$5)-ROW($B$1),0,1))
This portion is our criteria. NOT(ISERROR(... will return TRUE or FALSE. The double negative sign -- converts that value into a number: the first minus turns TRUE/FALSE into -1/0 and the second removes the negation, giving 1/0.
--NOT(ISERROR(FIND(C7,$A$1:$A$5)))
Lastly, the SUMPRODUCT function multiplies the matching arrays by each other and sums the results. The first two arrays return a series of 0's or 1's. If the row is both visible and matches our criteria, then we get 1*1 multiplied by the given value in the cell. If the given cell is not visible or does not match the criteria, one of the two returns a zero and it zeroes out the entire item.
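The three-array multiplication can be mirrored in Python using the sample data from the question. The visible flag stands in for the SUBTOTAL/OFFSET array and is an assumption added for illustration, as is the function name. Like FIND, the Python `in` test is case-sensitive.

```python
def sum_containing(rows, letter):
    """Sum cost for rows that are visible and whose program contains `letter`."""
    # visible and the containment test are booleans, so the product is either
    # the cost itself (1 * 1 * cost) or zero, as in the SUMPRODUCT.
    return sum(
        cost * visible * (letter in program)
        for program, cost, visible in rows
    )


# (program, cost, visible) using the question's sample data, nothing filtered.
rows = [
    ("A", 5.00, True),
    ("B", 4.00, True),
    ("ABC", 9.00, True),
    ("A", 2.00, True),
]
totals = {letter: sum_containing(rows, letter) for letter in "ABC"}
print(totals)

# Simulate the filter hiding the last row.
filtered = rows[:3] + [("A", 2.00, False)]
print(sum_containing(filtered, "A"))
```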