Formula help to extract rows meeting multiple criteria - vba

I have a spreadsheet with over 600k rows. I need to extract data based on multiple criteria and grab only the latest change numbers of each.
So an item number may have multiple entries based on quarter start dates and desc codes because it's been revised several times in that quarter but I just want the most recent one (highest change number) and that row returned or marked in a new column to then filter out.
Hope that makes sense.
I have the following columns. Column A (Desc Code) which has 12 different codes in it, then Column B (Item Number several thousand), Column C (Period Begin, Start of the quarters dating back to 1998) and then a Column H (Change Number). I need to basically pull "Each" row containing the highest change number, for each Item Number in each Period it was available for each code.
So basically The change numbers vary depending on how many changes the Item Number had in the quarter.
And each time there was a change there is a change number for each Item Number for Each Desc Code (12 rows for each).
Thanks.

You lost me somewhere near paragraph 4 but let's simplify things. If you just had two columns -- Item Number and Change Number -- and you had a record for each change, you could just use Excel's subtotal feature: at each change in Item Number, show the MAX of Change Number.
Use the same logic for your situation. Create a new column that combines your "category" criteria (item & desc, or item & period, or whatever), sort by it, then subtotal against that new column and return MAX of Change Number.
Edit:
Item Period Change
100 1 1
100 1 2
100 1 3
100 2 1
100 2 2
I'm not sure if this is how your data looks, but let's use it as an example (and let's forget about Desc Code for now). If you want to find the latest change by item and period, create a new column by combining the Item and Period columns. For example, insert a column (C) and use the formula: =A2&"_"&B2. Now your data looks like this:
Item Period I&P Change
100 1 100_1 1
100 1 100_1 2
100 1 100_1 3
100 2 100_2 1
100 2 100_2 2
Now use Excel's subtotal feature (in the Data menu/ribbon, not the worksheet formula). Here's an example of what this looks like:
In your scenario, for the "At each change in" box, pick your new column (C), since that uniquely identifies the category you are trying to identify (item AND period). "Use function" = Max. "Add subtotal to" = your Change Number column.
Click [OK] and Excel will add a new row with the maximum Change Number for each.
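If it helps to see the logic outside Excel, here is a minimal Python sketch of the same "combine the criteria into one key, then take the MAX of Change Number" idea. The sample rows, the date strings and the name latest_changes are made up for illustration; only the column roles (Desc Code, Item Number, Period Begin, Change Number) come from the question.

```python
# Hypothetical sample rows: (desc_code, item_number, period_begin, change_number).
rows = [
    ("D01", 100, "1998-01-01", 1),
    ("D01", 100, "1998-01-01", 2),
    ("D01", 100, "1998-01-01", 3),
    ("D01", 100, "1998-04-01", 1),
    ("D01", 100, "1998-04-01", 2),
    ("D02", 100, "1998-01-01", 1),
]


def latest_changes(rows):
    """Keep, for each (desc, item, period) key, the row with the highest change number."""
    best = {}
    for row in rows:
        key = row[:3]  # plays the role of the combined helper column
        if key not in best or row[3] > best[key][3]:
            best[key] = row
    return list(best.values())


latest = latest_changes(rows)
for row in latest:
    print(row)
```

Each key ends up with exactly one row, which is the row you would mark or filter on in the sheet.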

Related

Sum unordered data across multiple sheets

I think sumproduct sumifs indirect is what I need but I fail to see how to construct it;(
I have a workbook for logging volunteer hours.
I'm summing up the monthly hours (12 sheets/tables) to FY Totals sheet/table with hours by volunteer. So the sheet/table that I want to use the formula in is FY TOTALS
The workbook consists of 14 sheets:
sheet(VOLUNTEERS) has a table(tbl_volunteers) It contains data about the volunteer and the 1st 3 columns are duplicated on all 13 sheets (12 monthly and 1 FY totals)
A5[Status] B5[LastName] C5[Firstname]
The Month sheets have the above fields followed by 5 categories with hours per category for each volunteer
The sheet/table FY Totals is identical to Monthly, but I want the categories to sum all 12 months for each volunteer.
So I need to match criteria of [LastName][FirstName] and sum values in [category]D:I
I can send a copy of the file, but not load images here;(
You could use multiple consolidated ranges.
Notice that I have four different ranges. These could have been in different worksheets and the names aren't necessarily in the same order. A pivot table has been created that sums the hours worked for each person as requested. In this example, they just happen to be the same.
How?
1. Press Alt+D+P (windows) or cmd+alt+P (mac)
2. Select multiple consolidated ranges, then Next
3. Select Create a single page field for me, then Next
4. Add each of your ranges one by one, then Finish
Update
There's so much more you can do with PivotTables, as per your comments -- you can separate out data by creating your own page fields. Do this instead of step 3 above
Select I will create the page fields, then Next
For each range you select and add, click '1' page field and type the description of that data, for example 2016 Data, or 2017 Data
(Optional) Your fields will automatically be put into the report filter field. You might want to drag it into Row Labels field (below Row) to get the view below
You can then see the split of the different fields you used in your PivotTable - in yellow I have my 2016 data and you can see it's been split out in the PivotTable.
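For anyone who wants to check the numbers the PivotTable produces, here is a rough Python sketch of what the consolidation does: match rows on (LastName, FirstName) across the monthly tables, in any order, and sum each category. The volunteer names, the three category columns and the name fy_totals are invented for the example; the real workbook has its own categories.

```python
# Hypothetical data: one list per monthly sheet; each row is
# (last_name, first_name, [hours per category]). Row order differs per month.
months = {
    "Jan": [("Smith", "Ann", [2, 0, 1]), ("Jones", "Bob", [0, 3, 0])],
    "Feb": [("Jones", "Bob", [1, 1, 1]), ("Smith", "Ann", [4, 0, 0])],
}


def fy_totals(months):
    """Sum hours per category for each volunteer across all monthly tables."""
    totals = {}
    for rows in months.values():
        for last, first, hours in rows:
            running = totals.setdefault((last, first), [0] * len(hours))
            for i, value in enumerate(hours):
                running[i] += value
    return totals


totals = fy_totals(months)
for name, hours in sorted(totals.items()):
    print(name, hours)
```

Because the key is the name pair rather than the row position, the monthly sheets do not need to list volunteers in the same order.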

Removing duplicate values from a list if it meets conditions

I have been trying to write a for each loop to go through each row in one sheet (sheet 2) to remove duplicates in another sheet (sheet 1). I have had no luck researching either.
In sheet 1, I have a list of customer numbers in column B with the type of product they purchased in column c and the cost of that product in column d. In another sheet 2, I have a list of customers in column a and list of products in column b.
I have been trying to write a for each loop to go through each row in sheet 2 to check the customer number and product, find all the duplicates in sheet 1 with the same customer number and product, and deleting the row with the higher balance.
Sheet 1
A(Year) B(Customer #) C(Product Type) D(Cost)
1) 2015 100 A 1
2) 2015 100 A 2
Sheet 2
A(Customer #) B(Product Type)
1) 100 A
For example, if sheet 2 had 100 in column a and A in column b, it would delete row 2.
You could try using the Remove Duplicates option within Excel; would that solve your problem? Or is Sheet 2 updated to list only certain customer orders that you would like to remove?
Edit: To expand on this: take the list and sort it by customer and cost (low to high). Then, when you click Remove Duplicates, you'll have the option to select which columns to use as the basis for removing duplicates, so untick cost. It'll then clear out all but the topmost row for each customer and product, which has now been sorted to be the lowest value.
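The sort-then-dedupe idea can be sketched in Python as below. This is only an illustration under assumptions: the extra rows for customer 200 and the name drop_higher_cost_duplicates are made up, and restricting the removal to the pairs listed in sheet 2 follows the question rather than the Remove Duplicates dialog, which would dedupe every pair.

```python
# Hypothetical data: sheet 1 rows are (year, customer, product, cost);
# sheet 2 lists the (customer, product) pairs to de-duplicate.
sheet1 = [
    (2015, 100, "A", 1),
    (2015, 100, "A", 2),
    (2015, 200, "B", 5),
    (2015, 200, "B", 7),
]
sheet2 = {(100, "A")}


def drop_higher_cost_duplicates(rows, targets):
    """Sort by customer, product, cost; keep only the cheapest row for targeted pairs."""
    kept = []
    seen = set()
    for row in sorted(rows, key=lambda r: (r[1], r[2], r[3])):
        key = (row[1], row[2])
        if key in targets and key in seen:
            continue  # a cheaper row for this pair was already kept
        seen.add(key)
        kept.append(row)
    return kept


result = drop_higher_cost_duplicates(sheet1, sheet2)
for row in result:
    print(row)
```

Customer 200 keeps both rows here because the pair (200, "B") is not listed in sheet 2.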

Trailing Average Using AverageIf in Excel

I am trying to find the average for the last 3 instances only. I am using the AVERAGEIF statement and it will calculate the average for the entire range, but I need it to calculate only for the last 3 instances it finds (or fewer if fewer than 3 are available). I need the entire column for G and H to have the average for the last 3 games that the Team played.
This is what I have:
=AVERAGEIF(B3:C17,B17,D3:E17)
You can do this with array formulas (They have to be entered using the keys Ctrl+Shift+Enter)...
Basic steps are:
Find the row (including and above current) that is the third highest row number containing the team name (or use row 1 otherwise)
Use the INDIRECT ranges in your AVERAGEIF from B-that_row to C-current_row and D_that_row to E-current_row
So in cell F17 you would have the formula
{=AVERAGEIF(INDIRECT("B"&LARGE(IF(--($B$3:B17=B17)+($C$3:C17=B17),ROW($B$3:B17),1),3)&":"&CELL("address",C17)),B17,INDIRECT("D"&LARGE(IF(--($B$3:B17=B17)+($C$3:C17=B17),ROW($B$3:B17),1),3)&":"&CELL("address",E17)))}
We repeat some of the logic, because we have two ranges (criteria range and average range).
IF(--($B$3:B17=B17)+($C$3:C17=B17),ROW($B$3:B17),1) means that if column B or (using +) column C has the value in B17, give me the row number, otherwise 1 (our <3 case... we could make this 3, the first row of team names)
LARGE(...,3) will give us the third highest of this array --> the third highest row number having our team name
INDIRECT("B"&...&":"&CELL("address",C17)) is going to give us the range using our third highest row number to the current row, columns B and C
then we do exactly the same thing as you were doing in AVERAGEIF but using this INDIRECT range and the equivalent for columns D and E
Fun question! Good luck. And remember to use Ctrl+Shift+Enter to enter it!
EDIT The above was giving an #NUM! error for the first two rows - that was because the LARGE function was trying to get the third largest in an array of 2! Also noticed that there were some cases where the column letter needed to be absolute (i.e. $) for copying to the Away column. So the updated formula:
{=AVERAGEIF(INDIRECT("B"&LARGE(IF(--($B$3:$B17=B17)+($C$3:$C17=B17),ROW($B$3:$B17),1),MIN(3,ROW()-2))&":"&CELL("address",$C17)),B17,INDIRECT("D"&LARGE(IF(--($B$3:$B17=B17)+($C$3:$C17=B17),ROW($B$3:$B17),1),MIN(3,ROW()-2))&":"&CELL("address",$E17)))}
Replaced the 3 with MIN(3,ROW()-2) so that we get 3 from the third data row onwards, but 1 or 2 if we are in one of the first two data rows
OK I posted this prematurely and attempted to delete it when I realised it wouldn't work. It should work now.... providing you add another condition which is the game dates in column A. Remember that this is an array formula so hit ctrl+shift+enter. Dates in column A; teams in column B; stats in column D. This formula can reside somewhere permanent on the sheet so you can enter the team name (shown as F13 here) to get the three most recent stats.
=AVERAGE(VLOOKUP(LARGE(IF(B3:B24=F13,A3:A24),1),A3:D24,4),VLOOKUP(LARGE(IF(B3:B24=F13,A3:A24),2),A3:D24,4),VLOOKUP(LARGE(IF(B3:B24=F13,A3:A24),3),A3:D24,4))
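To sanity-check either formula, it can help to compute the trailing average by hand. Below is a small Python sketch of the intended result, assuming the layout from the question (team names in two columns, matching stats in two more). The team names, the numbers and the name trailing_average are invented for illustration.

```python
# Hypothetical games in sheet order: (home_team, away_team, home_stat, away_stat).
games = [
    ("Lions", "Bears", 10, 7),
    ("Bears", "Hawks", 3, 14),
    ("Lions", "Hawks", 20, 6),
    ("Hawks", "Lions", 9, 30),
    ("Bears", "Lions", 12, 40),
]


def trailing_average(games, team, upto, window=3):
    """Average the team's stat over its last `window` games within games[0..upto]."""
    stats = []
    for home, away, home_stat, away_stat in games[: upto + 1]:
        if home == team:
            stats.append(home_stat)
        elif away == team:
            stats.append(away_stat)
    recent = stats[-window:]  # fewer than `window` entries early in the season
    return sum(recent) / len(recent) if recent else None


print(trailing_average(games, "Lions", 4))
print(trailing_average(games, "Lions", 2))
```

The slice stats[-window:] does the job of MIN(3,ROW()-2) in the formula: it simply returns whatever is available when the team has played fewer than three games.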

Divide rows and then transfer divided values in new worksheet (values: time periods & amounts)

Hope you can help with this:
In worksheet "TOTALS" and Range "M11:N251" there are "from:to" time periods inserted by user ("From" is in "M11:M" and "To" is in "N11:N"). Number of rows (and inserted time periods) may vary. Time periods are quadrimesters (4 months) but not always 120 days (could be 119, 122 or something like that; if less than 100 days, the user must not be allowed to divide). Range "W11:W251" hosts amounts of money. So, for example, "M11:N11" can be 1/1/13-1/5/13 (dd/mm/yy) and W11 just a number (i.e. 98,45).
I want to be able to divide each time period almost by 2 (first part can be 60 days, second part rest of days) and amounts accordingly (depending on the number of days of its divided period) and transfer the divided periods and amounts to a new worksheet (TOTALS2) to -let's say- range "A11:B251" & "G11:G251").
So, in the above example, in "TOTALS2" we'll have 1/1/13-1/3/13 in "A11:B11" and 1/3/13-1/5/13 in "A12:B12", and the divided amounts accordingly to the counted days of "A11:B11" & "A12:B12" in "G11" and "G12".
And so on until there are no more time periods & amounts in TOTALS to divide and transfer to TOTALS2.
How can this be done? Ideas?
Thanks in advance!
If you want to check out the file, then download it:
http://www.sendspace.com/file/5kudcy
Based on the file you give in your link where the values in TOTALS start at row 10 (rather than 11 as stated), the prices are in column S (rather than W as stated) and the values in TOTALS2 are to appear in row 10 onwards (rather than row 11 onwards as stated) the following approach works:
Add 5 columns to worksheet TOTALS. These 5 columns have a header in row 9 and formulae in rows 10, 11, 12, etc. Using columns X, Y, Z, AA and AB, the header values in X9:AB9 are
Period, Int.To, Int.From, Proportion1, Proportion2
(where Int. is an abbreviation for intermediate.)
The formulae to use in cells X10 to AB10 are:
X10: =N10-M10+1 Calculates number of days from date M10 to date N10, inclusive.
Y10: =M10+59 Calculates end date of the 60 day period starting on date M10 (i.e calculates last date of first "half period")
Z10: =1+Y10 Calculates day after date in Y10 (start date of second "half period")
AA10: =60/X10 Proportion of whole period that is accounted for by first half-period
AB10: =1-AA10 Proportion of whole period that is accounted for by second half period
The formulae in X10:AB10 can be copied down for as many rows as contain data in worksheet TOTALS, to get something like...
The additional columns in TOTALS now provide the information that you need to split each quadrimester into two "halves". Cols M, N, Y and Z provide the date information and S, AA and AB the values for splitting each quadrimester's costs. To get it to display as desired in TOTALS2 you will also need to add a couple of columns to TOTALS2. I've used columns L and M with the following headers in L9 and M9:
Source, Part
In both L10 and L11 insert the value 1, whilst in M10 and M11 insert the values 1 and 2, respectively. Now add the following formulae to L12 and M12
L12: =1+L10
M12: =M10
Copy the formulae in L12 and M12 down so that columns L and M contain values in at least twice the number of rows (in TOTALS) that you wish to split into first half and second half periods. You should end up with the sequence 1,1,2,2,3,3,4,4 etc in column L and 1,2,1,2,1,2,1,2 etc in M.
The Source column (L) indicates which data row in TOTALS (1=first, 2=second, 3=third, etc) acts as the source of the values to be split in "half" and the Part column (M) indicates which "half" it is (1=first, 2=second).
All that remains is to put it all together using appropriate formulae in columns A, B and G of TOTALS2. The formulae to insert into A10, B10 and G10 are:
A10: =OFFSET(IF(M10=1,TOTALS!M$9,TOTALS!Z$9),L10,0)
B10: =OFFSET(IF(M10=1,TOTALS!Y$9,TOTALS!N$9),L10,0)
G10: =OFFSET(TOTALS!S$9,L10,0)*OFFSET(TOTALS!Z$9,TOTALS2!L10,M10)
Copy these formulae down the rows in TOTALS2 and it should be looking like...
The formulae in col A pick the start dates of the "half" periods using either col M or col Z from the correct Source row in TOTALS, according to whether the Part value is 1 or 2. Similarly, the formulae in col B pick the end dates using cols Y and N of TOTALS. The formulae in G multiply the cost value (col S of TOTALS) by the correct proportion from either col AA or col AB in TOTALS, again according to whether Part is 1 or 2.
I haven't included refinements such as:
preventing periods which are too short (or long) in TOTALS from being split (hint: you can detect this using the Period column, X, in worksheet TOTALS) or
controlling the number of displayed rows in TOTALS2 so that it is exactly twice the number of data rows entered in TOTALS (not too difficult and several ways it can be approached) or
rounding the calculated costs for the two "halves" to 2 decimal places AND making sure that they add back to the original quadrimester cost (it is formatting in the worksheet that is causing only 2 decimals to be displayed and it is not guaranteed that the displayed values will sum precisely to their original source cost - the examples chosen just happen to have this property. Again not too difficult to solve.)
However, the above is a basic solution on which you can build.
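As a cross-check on the helper columns, here is a minimal Python sketch of the same arithmetic for a single row: inclusive day count, a 60-day first part, and amounts split in proportion to days. The name split_period, its parameters and the sample dates are assumptions for illustration; the 100-day guard comes from the question, not from the formulae above.

```python
from datetime import date, timedelta


def split_period(start, end, amount, first_days=60, min_days=100):
    """Split one period into a 60-day part and the rest, prorating the amount by days."""
    total_days = (end - start).days + 1                  # like =N10-M10+1
    if total_days < min_days:
        raise ValueError("period too short to divide")   # rule stated in the question
    first_end = start + timedelta(days=first_days - 1)   # like =M10+59
    second_start = first_end + timedelta(days=1)         # like =1+Y10
    first_amount = amount * first_days / total_days      # like =60/X10 times the amount
    second_amount = amount - first_amount                # the 1-AA10 share
    return [
        (start, first_end, first_amount),
        (second_start, end, second_amount),
    ]


# Hypothetical 120-day period with a round amount.
halves = split_period(date(2013, 1, 1), date(2013, 4, 30), 120.0)
for part in halves:
    print(part)
```

Taking the second amount as the remainder, rather than multiplying by a second proportion, keeps the two parts adding back to the original amount before any rounding is applied.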

Excel countif Pulling apart a cell to do different things

Excel 2007
I have a row of cells with variation of numbers and letters (which all mean something.. not random.)
It's basically a timesheet. If they take a sick day they put in S, if they take a partial sick day they put in PS. The problem is they also put in the hours they did work too. They put it in this format: (number)/PS.
Now if it were just letters I could just do =countif(range,"S") to keep track of how many s / ps cells there are. How would I keep track if they are PS where it also has a number separated by a slash then PS.... I also still need to be able to use that number to add to a total. Is it even possible or will I have to format things different to be able to keep track of all this stuff.
Assuming this is something like what your data looks like:
    A   B   C   D      E
1   1   2   S   4/PS   8
...then you could do this:
1- add a column that just totals the "S" entries with a COUNTIF function.
2- add a hidden row beneath each real data row that will copy the numerical part of the PS entries only with this function in each column:
=IF(RIGHT(B1,2)="PS",IF(ISERROR(LEFT(B1,SEARCH("/",B1)-1)),"",INT(LEFT(B1,SEARCH("/",B1)-1))),"")
3- add another column to the right that just totals the "PS" entries by summing the hidden row from step 2.
4- add another column that totals everything by just summing the data row. that will ignore the text entries automagically.
5- have a grand total column that adds those three columns up
If you don't want to see the "S" and "PS" total columns, you can of course just hide them.
So in the end, the sheet would look like this:
    A   B   C   D      E   F   G   H    I    J
1   1   2   S   4/PS   8   1   4   11   16
2               4                            <--- hidden row
HTH...
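The cell-splitting logic can also be sketched in a few lines of Python, which may help when deciding how to lay out the helper row. The name parse_cell and the sample row are assumptions for illustration; the entry formats (a number, S, PS, or number/PS) are the ones described in the question.

```python
def parse_cell(cell):
    """Return (hours, code) for entries such as 8, "S", "PS" or "4/PS"."""
    text = str(cell).strip()
    if "/" in text:
        number, code = text.split("/", 1)   # hours before the slash, code after it
        return float(number), code.upper()
    try:
        return float(text), ""              # plain hours, no code
    except ValueError:
        return 0.0, text.upper()            # a code on its own, e.g. "S"


# Hypothetical timesheet row matching the example layout.
row = [1, 2, "S", "4/PS", 8]
parsed = [parse_cell(cell) for cell in row]

sick_days = sum(1 for _, code in parsed if code == "S")
partial_sick_days = sum(1 for _, code in parsed if code == "PS")
hours_worked = sum(hours for hours, _ in parsed)

print(sick_days, partial_sick_days, hours_worked)
```

An entry with non-numeric text before the slash would raise a ValueError in this sketch, so real data may need an extra check.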
My quick take on this is:
pass the cell value into a CSTR function, so no matter what is entered you will be working with a string.
parse the information. Look for S, PS, or any other code you deem to be valid. Use Left or Right functions if you need to look at partial string.
check for number by testing the ascii value, or trying a CINT function, which will only work if the string can be converted to integer.
If you can show a sample of your cells with variation of numbers and letters I can give you more help. Hope this works out.
-- Mike