Finding string and returning information from an adjacent cell - vba

Using Excel 2010. I need to check whether a string in one set of cells appears in another set of cells and, if so, return the information from the cell adjacent to the matching string. I originally did this with SEARCH, ISNUMBER and nested IF statements, but my source data set has several dozen entries, and the list of strings to search has several hundred entries. The data resembles that in the picture (a simplified example):
For a limited data set, I used nested IF statements, like:
IF(ISNUMBER(SEARCH($D$2,$A2,1)),"Cat Info",IF(ISNUMBER(SEARCH($D$3,$A2,1)),"Dog Info",IF(ISNUMBER(SEARCH($D$4,$A2,1)),"Elephant Info","Not Found")))
But now both data sets are too large for that approach.
What I need to do is to search the strings in column A for the keywords in column D. If a keyword is found, I need to return the corresponding information from column E.
For example, in cell B2, since the word dog is in A2, I would want the contents of E3 (Dog Section) to be displayed.
My list of keywords is unique (Column D, List) and I know that zero or one keyword will appear in the string in Column A (TheString).
I think that INDEX & MATCH functions may be part of my solution, but I am unsure how to find which List keyword is in the string and then return the Information column value.

No need for VBA. This can be done with a simple formula:
Enter this formula in cell B2:
=LOOKUP(2,1/SEARCH(D$2:D$7,A2),E$2:E$7)
Copy downward as far as needed.
Note: adjust the range references to the size of your data.
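For readers who want to check the logic outside the spreadsheet, here is a minimal Python sketch of what the formula does for one row. The function name `lookup_info`, the sample sentence and the sample keyword list are hypothetical. Unlike the formula, which returns #N/A when no keyword is found, this sketch returns a default string, and it skips empty keyword cells.

```python
def lookup_info(text, keywords, info, default="Not Found"):
    """Return the info paired with the last keyword contained in text.

    The match is case-insensitive, like SEARCH. LOOKUP(2, 1/SEARCH(...), ...)
    returns the value for the last matching row, so later matches overwrite
    earlier ones here too.
    """
    result = default
    lowered = text.lower()
    for keyword, value in zip(keywords, info):
        # Skip empty keyword cells (an assumption of this sketch).
        if keyword and keyword.lower() in lowered:
            result = value
    return result


# Hypothetical sample data standing in for columns D and E.
keywords = ["cat", "dog", "elephant"]
info = ["Cat Section", "Dog Section", "Elephant Section"]

print(lookup_info("The quick Dog jumped", keywords, info))
print(lookup_info("Nothing to see here", keywords, info))
```

Since the question states that at most one keyword appears per string, the "last match wins" behaviour does not matter for that data set.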

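I haven't worked out the whole thing, but this formula
=MAX(IF(ISNUMBER(SEARCH(D1,$A$1:$A$4,1)),ROW($A$1:$A$4),0))
entered as an array formula (Ctrl-Shift-Enter)
will give you the row in A1:A4 that contains the keyword in D1 (and likewise for D2 and so on when filled down). You can then INDEX on that row. However, it only returns the highest matching row, so if the keyword appears in rows 1 and 2, it will only return 2. It returns 0 when nothing matches.
Cheers.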

Related

How do you select irregular duplicates with Google Sheets queries?

I have a Google sheet with 186,000 rows. I have included a dummy spreadsheet to give you an idea of the data. I need to select ALL duplicates, that includes rows where the first names might not match (i.e. Cathy vs Catherine), but they still refer to the same individual. There are also instances where the addresses might be slightly different (like omitting "Ave" in one row but including it in another).
I need to write a query to account for all of these instances, including just regular duplicates. Or I could do multiple queries and just copy the results into one spreadsheet. In any case, I'm at a loss.
Dummy spreadsheet. I have included one example of each case I am trying to account for (3 total).
I have something that may be useful. See my example sheet here:
https://docs.google.com/spreadsheets/d/19h28go-nzunW6zexcMD61QjySUKJA3Q2Ci2Hu3OMuAg/edit?usp=sharing
Basically I build a key value for each record, along the lines you asked for.
All of the last name, part of the first name, part of the address, and the ZIP code. Other variations are easily added.
The formula is just a string concatenation of parts of these fields, as follows:
=ArrayFormula(
IF(ROW(A2:A)=2,"DupeKey",
IF(A2:A<>"",A2:A &LEFT(B2:B,$N$1) &LEFT(G2:G,$N$1) &K2:K,"")))
A valuable option is to allow varying the length of the required matching sub-string, from the first name and address. This is controlled for the formula by selecting a substring length of 1 to 6 in cell N1, and seeing how this changes the duplicate records that are found. The shorter the substring length, the more duplicate (or possibly duplicate) records will be found.
Conditional formatting is used to highlight the duplicate records.
And you can use the column filters to sort by different data columns - to put all of the duplicates at the top, sort by column N, in Z-A order, and exclude blanks.
Note that this isn't perfect. If someone accidentally types a space, or anything else, at the start of a data field, it will not be considered a duplicate. Better logic would be required to catch those.
Let me know if this helps.
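The key-building idea above can also be sketched in Python. This is only an illustration under assumptions: the record layout (last name, first name, address, ZIP), the sample rows and the function names `dupe_key` and `find_duplicates` are hypothetical, and `n` plays the role of the substring length in cell N1.

```python
from collections import defaultdict


def dupe_key(last, first, address, zip_code, n=3):
    """Concatenate last name, first n chars of first name and address, and ZIP."""
    return f"{last}{first[:n]}{address[:n]}{zip_code}"


def find_duplicates(records, n=3):
    """Group records by key and return only the groups with more than one record."""
    groups = defaultdict(list)
    for record in records:
        groups[dupe_key(*record, n=n)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample rows: (last name, first name, address, ZIP).
records = [
    ("Smith", "Cathy", "12 Oak Ave", "90210"),
    ("Smith", "Catherine", "12 Oak", "90210"),
    ("Jones", "Bob", "5 Elm St", "10001"),
]

for group in find_duplicates(records, n=3):
    print(group)
```

With `n=3` the two Smith rows share a key; with `n=6` they no longer do, which mirrors how a shorter substring length finds more possible duplicates.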
You can use these formulas:
If cell B3 matches "John", write "match"; otherwise write "no":
=IF(REGEXMATCH(B3,"John"), "match", "no")
If cell F2 contains the content of cell B3, write "match"; otherwise write "no". SEARCH returns an error rather than 0 when the text is not found, so wrap it in ISNUMBER:
=IF(ISNUMBER(SEARCH(B3, F2)),"match","no")
References:
REGEXMATCH
SEARCH

Find first non-blank cell in column that meets criteria in another column

I've compiled multiple spreadsheets containing sporadic employee information, and I'm now trying to consolidate all of the information to remove duplicates and blanks. The formula below is my starting point, but if the first cell that meets that criteria is blank, it returns a blank. I want it to find the next cell that meets that criteria but has a value.
=INDEX(Working!C:C,MATCH($A3,Working!$B:$B,0))
Below is what the Working tab looks like, which contains the master list of data including blanks and duplicates. Working!C:C is the list of last names; $A3 is the Employee ID I'm hoping to retrieve data for, and Working!$B:$B is the list of Employee IDs. I'll be doing this for many columns, so to illustrate this, in the table example below I've shown that Column D is the phone number. Any help you can provide is appreciated!
Column B    Column C    Column D
287         Doe         (blank)
287         (blank)     333-333-3333
287         Doe         (blank)
Use the following array formula:
=INDEX(Working!C$1:C$100,MATCH(1,($A3 = Working!$B$1:$B$100)*(Working!C$1:C$100<>""),0))
Being an array formula it needs to be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly then Excel will put {} around the formula.
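Please note that with an array formula the references should be limited to the smallest range that covers the dataset, because the formula evaluates every cell in the range; full-column references such as C:C will make it slow.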
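As a cross-check of the logic, the same "first row where the ID matches and the value is not blank" rule can be sketched in Python. The function name `first_non_blank` and the list layout are hypothetical; the sample data mirrors the table in the question. Where the formula would return #N/A, this sketch returns None.

```python
def first_non_blank(ids, values, wanted_id):
    """Return the first non-blank value whose row has the wanted ID, else None."""
    for row_id, value in zip(ids, values):
        # Both conditions must hold, like multiplying the two tests in MATCH(1, ...).
        if row_id == wanted_id and value != "":
            return value
    return None


# Sample data mirroring columns B, C and D of the Working tab.
ids = [287, 287, 287]
last_names = ["Doe", "", "Doe"]
phones = ["", "333-333-3333", ""]

print(first_non_blank(ids, last_names, 287))
print(first_non_blank(ids, phones, 287))
```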

Search and find in VBA

I need help for writing a macro to use in Excel. Essentially what I'm working with is large groups of financial data that are almost always unique. I need to be able to compare each cell in one column (B) to another entire column (H) to search for a matching value. After finding the matching value, I need the macro to print the value related to the original found value into the cell next to the searched value. (Column G will have titles referring to the value of cell H)
Basically I need to be able to compare the individual value of each cell in column B to the entire column H, and if and when there is a match, have the value of column G appear in column A next to the original cell in column B. Sorry for the terrible explanation.
Search column H for a value matching B2. If H30 = B2, print G30 in A2.
Since you can't use VLOOKUP() here (the column to return, G, is to the left of the column being searched, H), you'll need to use a nested MATCH() and INDEX().
=INDEX(G:G,MATCH(B1,H:H,FALSE))
For reference:
INDEX()
MATCH()
Searching
You most likely do not need a macro for this. Try the VLOOKUP function: https://support.office.com/en-in/article/VLOOKUP-function-adceda66-30de-4f26-923b-7257939faa65
Edit:
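Syntax should be something like:
=VLOOKUP(B1, G:H, 2, FALSE)
then fill down and adjust absolute references as appropriate for your spreadsheet.
Note that VLOOKUP searches the first column of the range (G here) and returns the value from the second (H), so this works as written only if the values to match are in G and the titles are in H. With the layout described in the question (match on H, return G), the two columns would need to be swapped, or INDEX/MATCH used instead.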
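Outside the spreadsheet, the lookup discussed in this thread (match each value in B against column H and return the title from G in the matching row) can be sketched in Python as follows. The function name `fill_titles` and the sample titles and amounts are hypothetical; unmatched values give None here, where the formulas would give #N/A.

```python
def fill_titles(col_b, col_g, col_h):
    """For each value in col_b, return the col_g title of the first col_h match."""
    first_title = {}
    for title, value in zip(col_g, col_h):
        # setdefault keeps the first occurrence, like an exact-match MATCH.
        first_title.setdefault(value, title)
    return [first_title.get(value) for value in col_b]


# Hypothetical sample data standing in for columns G, H and B.
col_g = ["Rent", "Payroll", "Taxes"]
col_h = [1200.5, 8300.0, 410.25]
col_b = [410.25, 99.0, 1200.5]

print(fill_titles(col_b, col_g, col_h))
```

Building the dictionary once and then looking up every B value avoids rescanning column H for each row, which may matter for large data sets.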

fix for moving rows in google-spreadsheet breaks array formula

I have an array formula in google sheets for an entire column, e.g. the following formula in C1
=ArrayFormula(A1:A+B1:B)
And there is data in columns A and B.
If I grab a row and move it to another location, the corresponding value in column C of that row is pasted as a hard-coded value, which breaks the entire array formula.
Is there a way around this?
Unfortunately, there is no simple way around it. With arrays, the formula is usually tied with the positioning of the values as results vary according to the position of each value in an array. Hence, moving anything will result in the distortion of the formula.
The only simple way around it is to move your values by cut/copy and paste instead of dragging the whole row around. A more robust but more complex option is to write a script for a custom function in your sheet that performs the necessary calculations; it will not be affected when you move the values, as it takes its inputs from cell positions that have been predefined in the script.
Workaround found.
I was doing a manual sort, which moved the row holding my ARRAYFORMULA() and has the same consequence as drag-and-dropping that "special row".
You will have to work with two sheets however.
Suppose that in your original data sheet (sheet 1), you have data in two columns (A and B), and you want to use ARRAYFORMULA() on column C, like in your example.
Leave sheet 1 "as is", create another sheet (sheet 2) and in its top-left cell type this (replacing Sheet1 with the actual name of your data sheet):
={Sheet1!A1:B}
In sheet 2's column C, one cell below top (to leave room for header), enter:
=ARRAYFORMULA(A2:A+B2:B)
Then you can sort data as you wish in sheet 1, and ARRAYFORMULA() will always work in sheet 2 👍

Use Excel/OpenOffice cell names within drag-completion

I have a lot of measured values in each column. I use formulas under those values to calculate with them. I always edit the first column and drag-complete (small square in the south-east of the selected cell) to change the other columns, too.
It was fine while dealing with 5 values, but with 20 values in a formula, things are getting complicated. I would like to use cell names, as I found in Variable in Excel, but when I use drag-complete, these cells are not adapted for the next column; they behave like $D$1 rather than D1.
Ideas for solutions:
Perhaps I can declare an row of cells as an array and index it with cellname(row), but how is this possible?
Perhaps it is easier with a small vba script, but I would like to avoid this.
Thanks in advance.
Edit 1:
I was afraid that my question was not that clear. I will try to clarify it with the following files. Since the Excel tag was removed, I uploaded an ODS file:
My file looks like the uploaded short example, example.ods.
I created cell names in the second column, like "size". Then I put a human-readable formula like "=size+step+thickness*weight" in C7. When I drag-complete it to cells D7 and E7, as shown in example.png, I of course get the same result as in C7, because the cell names are used as absolute references, like $B$2 for example.
How can I have human-readable formulas applied to D7 and E7 without editing D7 and E7 by hand? When I use "=C2+C3+C4*C5" for C7, I can of course use drag-completion.
I hope this is clearer now. I guess this is some basic functionality, but I just don't know how to describe it well. Perhaps you have a similar idea to make it more readable than "=C2+C3+C4*C5".
This works in OpenOffice.org Calc as well as in LibreOffice Calc, but it's crucial to define the cell names for every column that will be evaluated by the formula. Here's a step-by-step solution, based on the example document:
1. Start with a spreadsheet containing just the values together with row and column heads.
2. Create the cell range names:
   a. Select the data range including the column holding the row names (OOo will use those strings as names in the next steps).
   b. Select "Insert -> Names... -> Create".
   c. Select "Left Column" to name the rows based on the content of the first column.
   Result: four names, one for each row, named as desired.
3. Create the formula for the first data row (here incomplete, demonstrating OOo's tooltips).
4. Drag-complete for all other data rows, giving the final result (with Tools -> Detective -> Trace precedents activated; the detective points to the array's first column, but the formula will use the values of the current column).
You can use relative references in Names; it is easier to use R1C1 mode for this:
Define a Name Size with a RefersTo of =R2C
Then wherever you use the name Size in a formula it will refer to the current column and row 2