Extract substring of list based on another list - vba

I have two lists: one consisting of names with added information in various forms (list 1, see the example below), and one consisting of the cleanly formatted names, i.e. with no added information (list 2).
List 1
--------
Netto City | Value
Imerco City | value
Bilka Suburb | value
Bauhaus, City | Value
City FDB Superb | Value
List 2
------
Netto
Imerco
Bilka
Bauhaus
FDB Super
What I am trying to do is create a filter so that, no matter what the first column of my source data (list 1) looks like, I will be able to sum the values based on list 2.
Something similar to this: Excel - extracting data based on another list
I tried using VLOOKUP, but that does not search for substrings. Then I tried using
=IF(COUNTIF(A$4:A$9;"*"&D5&"*")>0;
INDIRECT(ADDRESS(MATCH("*"&D5&"*";A$4:A$9;0);4));"not found")
But that appears to do the opposite: it searches list 1 for a single cell value from list 2.
I can't quite work out whether this would work just as well. I haven't been able to get it to work anyway, hence my search for the other way: search list 2 for each item from list 1.
But, ultimately, what I am trying to accomplish is to create a list from the source data, which I can use to categorize each item in list 1 from, based on list 3
List 3
Bilka | Cat1
Imerco | Cat2
FDB Super | Cat1
etc.
For that to work, I need a clean list of the source data, without all the extra information that comes with it.
I use the following sumif
=SUMIFS($F$3:$F$703;$B$3:$B$703;
"="&$H4;$D$3:$D$703;">="&I$2;$D$3:$D$703;"<="&I$3)
to sum all amounts belonging to a particular item in List 3 (where I've manually created List 3), between two dates.
The purpose of this is to create a sheet that contains all expenditures for a particular store or category of one's own choosing. For instance, the stores listed in List 1 are primarily food stores.
Edit - Clarification.
What I am proposing to do is a multistage process.
Stage 1:
Insert original source data (done)
Stage 2:
Filter source data for unique values (done)
Stage 3:
Create a list of approved names for each item in the source data
- I.e., Bilka Suburb becomes Bilka, Netto City becomes Netto
Here 'Netto' and 'Bilka' are approved names, which are created manually to allow for grouping in Stage 4. I am looking to automate this step.
Stage 4:
Group each item from the list of Stage 3 based on name and date interval: weekly, monthly, whatever (done, if I could only get Stage 3 to work, as it already works on my manually corrected data).
Stage 5:
Select the appropriate category and type for each item in the resulting list from Stage 3:
Bilka is a food place, so it would get the category 'food', same as Netto, whereas Bauhaus would get the category 'Building Supplies'. Each of these items would get the type 'expense', whereas, say, wages would get the type 'income' (done).
The solution to Stage 5 is just a VLOOKUP, based on the category, into a table that lists each category with a type, so that is simple enough.
Final solution: this requires the list to iterate over in column G and the approved names in column H. The matching approved name is written two columns to the right of each source item (column I). One flaw is that it cannot tell the difference between items such as "Super" and "SU", and I don't know how to fix that. If anyone has any suggestions on that, I am all ears.
Sub LoopCells()
    Dim ws As Worksheet
    Dim approvedCell As Range, sourceCell As Range
    Dim lrApproved As Long, lrSource As Long

    Set ws = Worksheets("RawData")
    lrApproved = ws.Cells(ws.Rows.Count, "H").End(xlUp).Row
    lrSource = ws.Cells(ws.Rows.Count, "G").End(xlUp).Row

    For Each approvedCell In ws.Range("H2:H" & lrApproved).Cells 'Approved stores entered by users
        For Each sourceCell In ws.Range("G2:G" & lrSource).Cells 'Items found from bank statement export
            If InStr(UCase(sourceCell.Value), UCase(approvedCell.Value)) <> 0 Then
                sourceCell.Offset(0, 2).Value = approvedCell.Value 'Write the match two columns right of G
            End If
        Next sourceCell
    Next approvedCell
End Sub
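On the "Super" versus "SU" problem: one possible heuristic is to keep the longest approved name that occurs in the source text, so that the result no longer depends on the order of the approved list. The following is only a sketch under that assumption. The function name LongestApprovedMatch and its parameters are hypothetical, and approvedNames is assumed to be an array of plain text values (for example a multi-cell Range's .Value) with no error values in it.

```vba
' Sketch: return the longest approved name contained in sourceText,
' or "" if none of the names occurs in it. Matching is case-insensitive.
Public Function LongestApprovedMatch(ByVal sourceText As String, ByVal approvedNames As Variant) As String
    Dim candidate As Variant
    Dim best As String

    For Each candidate In approvedNames
        ' Only test names longer than the current best; this also skips empty names
        If Len(candidate) > Len(best) Then
            If InStr(1, sourceText, candidate, vbTextCompare) > 0 Then best = candidate
        End If
    Next candidate

    LongestApprovedMatch = best
End Function
```

For example, LongestApprovedMatch("City FDB Superb", Array("SU", "Super")) would pick "Super" whichever order the two names are listed in. It remains a heuristic: a short name that happens to occur inside an unrelated word will still match when no longer name does.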
Thanks for all the help.
Edit: Added final solution and VBA tag.

This works for me:
=SUM(B$3:B$7*NOT(ISERROR(SEARCH(A11,A$3:A$7))))
This assumes that your example list 1 is in range A3:B7 and your list 2 in A11:B15. Paste the above formula in cell B11 and press Ctrl+Shift+Enter to enter it as an array formula. Then you can drag-copy it all the way down to B15.
Explanation: SEARCH for e.g. "Netto" in the cells of List 1. For cells that do not contain that string, SEARCH returns an error. So we're looking for cells that do not return an error. We now have an array of booleans indicating this. Multiply it element-by-element by the array of values. In this multiplication, TRUE is interpreted as 1 and FALSE as zero, so you're screening out the values that don't correspond to "Netto".
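The same mask-and-sum idea can be written out as a loop, which may make the logic easier to follow. This is only an illustrative sketch, not part of the formula above: the name SumWhereContains and its parameters are hypothetical, both inputs are assumed to be one-dimensional arrays with the same bounds, and InStr only approximates SEARCH (it does not support wildcards).

```vba
' Sketch: sum the values whose label contains needle (case-insensitive).
Public Function SumWhereContains(ByVal labels As Variant, ByVal values As Variant, ByVal needle As String) As Double
    Dim i As Long
    Dim total As Double

    For i = LBound(labels) To UBound(labels)
        ' InStr(...) > 0 plays the role of NOT(ISERROR(SEARCH(...)))
        If InStr(1, CStr(labels(i)), needle, vbTextCompare) > 0 Then
            total = total + CDbl(values(i))
        End If
    Next i

    SumWhereContains = total
End Function
```

Called as SumWhereContains(Array("Netto City", "Bilka Suburb"), Array(10, 20), "netto"), only the first value passes the filter.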
Here's a screenshot of my setup:

Perhaps I've misunderstood but can't you use SUMIF?
=SUMIF(A$4:A$9;"*"&D5&"*";B$4:B$9)

Instead of going with VBA, you can extract this with a simple, small formula: =INDEX(List2!A2:A10,MATCH(1,COUNTIF(List1!A2,"*"&List2!A2:A10&"*"),0)) (press Ctrl+Shift+Enter). This assumes you want to extract the list 2 name into list 1.

Related

Excel 2016: Conditional Formatting: Highlight row if cell value is in list

Excel 2016 Conditional Formatting question:
I have two tabs, Data and List:
Data has 5 columns. Column A is the item ID number, the others have
project related data.
List has 1 column. This is a list of ID numbers
that have been processed.
Here is the question:
How do I highlight the rows for processed ID numbers? I want to be able to add ID numbers to List as I process more rows. I want to see processed items in Data in green highlight, because green makes the boss happy!
Looking forward to your input!
I would use a VLOOKUP to find the value in the List sheet. An error means the value doesn't exist. Since you want to know if it does exist, just invert the boolean result with NOT
=NOT(ISERROR(VLOOKUP($A1,List!$A:$A,1,FALSE)))
Note: This is a Classic > Formula formatting rule, and the lookup value is $A1 because my "applies to" range starts on row 1 (and we always look at col A).
[Screenshots: rule, formatting range, result]

VBA - How to refer to column names in Array?

To start, I have a scraping function that scrapes a table from a web page, and stores the data in a 2D array.
The 2D array starts from row 0 to however many rows of data are on the page, and columns 1 to however many columns there are.
Row 0 simply contains all the column names.
My ReDim:
ReDim Addresses(0 To lngTotalRecords, 1 To columns.Count) As String
I've then stored this 2D array into a scripting dictionary called dictClients, as there are multiple clients that all have their own entries for the same table on the web page.
So in my dictionary I have something like the following to refer to a particular address table for a particular client:
dictClients(1)("Addresses")
dictClients(2)("Addresses")
I now want to be able to check if a cell in a certain row contains a specific value, however the web page allows the columns to be reorganized so that:
dictClients(1)("Addresses")(1,1) 'row 1 column 1
will not always refer to the "Street Number" column. The street number column could be the following for someone else for example:
dictClients(1)("Addresses")(1,3) ' row 1 column 3
Given that these cells:
dictClients(1)("Addresses")(0,1) '(0,2) (0,3) etc.
all refer to the column's names, what's the best way for me to find the position of a particular named column?
Example: I want to get the value of the Postal Code cell in row 1, so I need to look in
dictClients(1)("Addresses")(1, POSTALCODECOLUMN),
which isn't always in the same position on the web page.
I was thinking of using the following function:
Public Function column(strArr() As String, strColumnName As String) As Long
    Dim i As Long
    For i = 1 To UBound(strArr, 2)
        If strArr(0, i) = strColumnName Then
            column = i
            Exit Function 'Stop at the first matching header
        End If
    Next
End Function
But it just feels so lengthy calling it like:
strPostalCode = dictClients(1)("Addresses")(1, column(dictClients(1)("Addresses"), "Postal Code"))
Is there a better and easier way to do this?
Thanks.
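One possible way to shorten those calls, offered only as a sketch: build a header-name-to-column-index dictionary once per table and reuse it for every lookup. The name BuildColumnIndex is hypothetical, the table is assumed to keep its headers in row 0 as described above, and the case-insensitive comparison is an assumption rather than a requirement from the question.

```vba
' Sketch: map each header in row 0 of a 2D array to its column index.
Public Function BuildColumnIndex(ByVal table As Variant) As Object
    Dim index As Object
    Dim i As Long

    Set index = CreateObject("Scripting.Dictionary")
    index.CompareMode = vbTextCompare ' Assumption: header names compare case-insensitively

    For i = LBound(table, 2) To UBound(table, 2)
        ' Keep the first occurrence if a header name is repeated
        If Not index.Exists(table(0, i)) Then index.Add table(0, i), i
    Next i

    Set BuildColumnIndex = index
End Function
```

With Dim cols As Object and Set cols = BuildColumnIndex(dictClients(1)("Addresses")), the lookup becomes dictClients(1)("Addresses")(1, cols("Postal Code")). The header scan then runs once per table instead of once per cell access.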

Dynamic reference in excel formula

I have the following array formula which works for what I want to do but I'm trying to change the formula when a user selects a value.
=INDEX($A$2:$B$70,SMALL(IF($A$2:$B$70=$A$121,ROW($A$2:$B$70)),ROW(1:1))-1,1)
It's used for a monthly report and the user will choose from a drop down the day of the month, e.g 1,2,3 - 31.
So if the user selects 1 from the drop down menu I want the formula to use the above formula.
If they select 2 for example I want the formula to move over a column so it would change to
=INDEX($A$2:$C$70,SMALL(IF($A$2:$C$70=$A$121,ROW($A$2:$C$70)),ROW(1:1))-1,1)
and so on moving over a column at a time.
Is this possible at all, and can it be done without VBA?
I have an example of what I want done on the following link
https://docs.google.com/spreadsheets/d/1MDOzoQxYLgW-UOyljZsMwSu8zyAB7O2k1V-bTNP5_F0/edit?usp=sharing
All the data is on the first tab called staff. Each employee has a row and the duty assigned under the corresponding day column.
On the Roster tab it summarises each day. So what I am trying to get to happen is when you choose the day of the month (or preferably the actual date) the sheet changes to reflect the data.
At the moment the formula I have works for just Day 1, because the column references are hard-coded into it. I was hoping to choose, for example, 6 from the drop-down, have the formula map the chosen day to the corresponding range in the raw data, and change the reference from Staff!$A$2:$B$68 to Staff!$A$2:$G$68.
If the formula finds no more entries it shows #NUM!, but I intend to use the function ISERROR() to replace #NUM! with "".
This is what I'm trying to achieve, if it makes sense.
There are a few issues here. You are returning the value from column A, so the first range can be $A$2:$A$70, which means you don't need the 1 to specify the column_num. The IF statement was covering A2:C70 when you really only want either B2:B70 or C2:C70, depending on whether 1 or 2 is selected.
Assuming that A122 has either a 1 or 2 in it then,
=INDEX($A$2:$A$70, SMALL(IF(INDEX($B$2:$C$70, 0, $A$122) = $A$121, ROW($1:$69)), ROW(1:1)))
Standard non-array alternative,
=INDEX($A$2:$A$70, SMALL(INDEX(ROW($1:$69)+(INDEX($B$2:$C$70, 0, $A$122) <> $A$121)*1E+99,, ), ROW(1:1)))

good method for approving mechanical turk participants?

Does anyone have a good method for approving people who took part in a survey on survey monkey, and who were recruited through mechanical turk?
I filter out people who did not pay attention during the survey by asking questions with obvious answers - if people get 'n' number of them wrong, I exclude them from payment.
After I downloaded .csv from mechanical turk, I paste two columns at the end of the .csv the MTurk Id and a 1 or 0 next to the name, indicating whether they will be paid or not.
How can I write a function that will search through the two columns containing MTurk IDs (the one that came in the .csv and the one that I pasted in) and then return whether the MTurk ID has a 1 or 0 next to it? This would make dis/approving so much easier.
I assume you are using a spreadsheet program since you mention "Adding two columns"? Why don't you just sort by the column with the one or zero in it to group the approved turk IDs together?
Here is how to accomplish this with a vlookup:
Assume you have your list of Turk IDs and the 1/0 approval code in columns A and B (A contains the Turk IDs and B contains a 1 or 0). Also assume you have the ID to test in column C and you are going to put the result of the vlookup test in column D:
     A - Turk ID    B - Approval    C - ID to test    D - Result
     -----------    ------------    --------------    ----------
1    ABC12345       0               DEF46253
2    ERF78878       1               HFH36251
3    HFH36251       1               ERF78878
4    DEF46253       0               ABC12345
Set the formula of cell D1 to =VLOOKUP(C1,$A$1:$B$4,2,FALSE)
Paste that into D2..D4 (obviously your list will be larger)
It will find the Turk ID in Col A and fill in the corresponding Approval value in Col D.
If you want to know what the arguments to the VLOOKUP function are: the first is the value to look for (the ID you want to check); the second is the entire range of values to check (use the $'s in front of the cell references to make them absolute, so they don't change when you paste the formula into new cells); the third is the column of that range to pull (column 2 of the range is the approval number); the last argument is FALSE, which forces an exact match of ID to ID.
Hope that helps.

Excel countif Pulling apart a cell to do different things

Excel 2007
I have a row of cells with variation of numbers and letters (which all mean something.. not random.)
It's basically a timesheet. If they take a sick day they put in S, if they take a partial sick day they put in PS. The problem is they also put in the hours they did work too. They put it in this format: (number)/PS.
Now if it were just letters I could just do =countif(range,"S") to keep track of how many s / ps cells there are. How would I keep track if they are PS where it also has a number separated by a slash then PS.... I also still need to be able to use that number to add to a total. Is it even possible or will I have to format things different to be able to keep track of all this stuff.
Assuming this is something like what your data looks like:
     A    B    C    D       E
1    1    2    S    4/PS    8
...then you could do this:
1- add a column that just totals the "S" entries with a COUNTIF function.
2- add a hidden row beneath each real data row that will copy the numerical part of the PS entries only, with this function in each column:
=IF(RIGHT(B1,2)="PS",IF(ISERROR(LEFT(B1,SEARCH("/",B1)-1)),"",INT(LEFT(B1,SEARCH("/",B1)-1))),"")
3- add another column to the right that just totals the "PS" entries by summing the hidden row from step 2.
4- add another column that totals everything by just summing the data row. That will ignore the text entries automagically.
5- have a grand total column that adds those three columns up.
If you don't want to see the "S" and "PS" total columns, you can of course just hide them.
So in the end, the sheet would look like this:
     A    B    C    D       E    F    G    H     I     J
1    1    2    S    4/PS    8    1    4    11    16
2                   4                                      <--- hidden row
HTH...
My quick take on this is:
pass the cell value into a CSTR function, so no matter what is entered you will be working with a string.
parse the information. Look for S, PS, or any other code you deem to be valid. Use Left or Right functions if you need to look at partial string.
check for number by testing the ascii value, or trying a CINT function, which will only work if the string can be converted to integer.
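The three steps above could be sketched roughly as follows. This is an illustration only: the name ParseEntry and its parameters are hypothetical, and the entry is assumed to be a bare number ("8"), a bare code ("S"), or a number and a code separated by a slash ("4/PS"), with no error values in the cell.

```vba
' Sketch: split a timesheet entry into its hours and its code.
Public Sub ParseEntry(ByVal cellValue As Variant, ByRef hours As Double, ByRef code As String)
    Dim entry As String
    Dim slashPos As Long

    entry = Trim$(CStr(cellValue)) ' Step 1: always work with a string
    hours = 0
    code = ""

    ' Step 2: anything after a slash is treated as the code
    slashPos = InStr(entry, "/")
    If slashPos > 0 Then
        code = UCase$(Mid$(entry, slashPos + 1))
        entry = Left$(entry, slashPos - 1)
    End If

    ' Step 3: the remainder is either the hours or a bare code
    If IsNumeric(entry) Then
        hours = CDbl(entry)
    ElseIf slashPos = 0 Then
        code = UCase$(entry)
    End If
End Sub
```

Calling ParseEntry "4/PS", h, c would leave the hours in h and "PS" in c, so one loop over the row can both count the codes and add up the hours.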
If you can show a sample of your cells with variation of numbers and letters I can give you more help. Hope this works out.
-- Mike