Excel formula/macro for finding data with letters and numbers - vba

I have a column of data in Excel that contains different values. I am looking for a formula or macro to distinguish the different types of data. For instance, I have a VLOOKUP for numerical values =VLOOKUP(E2,TECH!B:F,4,FALSE) but this only works for certain values.
For instance, this returns the value of E2 when it's listed as a 4 digit extension in the column. Some data points are listed as "i78990" or "n65778", etc. I want to return a value of "Chicago" when an "i" is before the number and an "Atlanta" if the "n" is before the number, etc.

Use the LEFT function to get the first letter: LEFT(E2,1). Apply it to all of your cells, store the result in a separate column, and perform the VLOOKUP on that column.
For a more general case of separating numbers from text you can use the following algorithm:
-Break the alphanumeric string into separate characters.
use: MID(A1,ROW($1:$9),1)
-Determine whether there is a number in the decomposed string.
use: ISNUMBER(1*MID(A1,ROW($1:$9),1))
-Determine the position of the number in the alphanumeric string.
use: MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$9),1)),0)
-Count the numbers in the alphanumeric string.
use: COUNT(1*MID(A1,ROW($1:$9),1))
or as a whole:
MID(A1,MATCH(TRUE,ISNUMBER(1*MID(A1,ROW($1:$9),1)),0),COUNT(1*MID(A1,ROW($1:$9),1)))
you can see an example in the following link.
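To check the logic outside Excel, here is a minimal Python sketch of the same two ideas: take the first character as a prefix and look it up, and pull the digits out of the string. The prefix table holds only the two pairs mentioned in the question, and the function names are made up for illustration. Unlike the MID/MATCH/COUNT formula, which looks at the first nine characters only, the regular expression scans the whole string and returns the first contiguous run of digits.

```python
import re

# Prefix-to-city pairs taken from the question; extend as needed.
CITY_BY_PREFIX = {"i": "Chicago", "n": "Atlanta"}

def city_for(value):
    """Rough equivalent of looking up LEFT(E2,1) in a small table."""
    text = str(value)
    # Returns None when the first character is not a known prefix.
    return CITY_BY_PREFIX.get(text[:1].lower())

def first_number(value):
    """Return the first run of digits in the string, or '' if there is none."""
    match = re.search(r"\d+", str(value))
    return match.group(0) if match else ""
```

For example, city_for("i78990") gives "Chicago" and first_number("i78990") gives "78990".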

Related

How to count occurrence of each sentence in Excel in this specific case?

The challenge here is the sentence is not split by cell. They are in the same column, but they might appear in the same cell. One sentence per line.
I need to count the occurrence of each sentence, for example, occurrence of "The cat is pink" is 2 and occurrence of "The dog is green" is 1.
I can also do Access 2016 if needed.
(Assuming you can split multi-sentence cells into multiple cells)
1) Split the cells that contain multiple sentences; you should be able to adapt this code to do this.
2) Make a copy of the column, elsewhere on the same sheet or in another sheet (this example uses column B in the same sheet).
3) Remove duplicate values from the copied column.
4) Next to the column use the following array formula:
{=SUM(LEN(A$1:A$5)-LEN(SUBSTITUTE(A$1:A$5,B1,"")))/LEN(B1)}
(Press Ctrl+Shift+Enter when entering an array formula; Excel adds the curly braces itself, so do not type them.)
Sort column A from A to Z, then
add a header to column A (to use in subtotals).
Then, from the Data tab, use "Subtotal" with "Count" as the function.
=COUNTIF(A:A,"*Cat is Yellow*")
The same formula can be applied for all the rows.
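If doing this outside Excel is an option, the counting itself is short. The following Python sketch assumes, as the question states, that each sentence sits on its own line inside a cell; the sample data and the function name are invented for illustration.

```python
from collections import Counter

def count_sentences(cells):
    """Count each line (one sentence per line) across all cells."""
    counts = Counter()
    for cell in cells:
        for line in str(cell).splitlines():
            sentence = line.strip()
            if sentence:  # skip blank lines
                counts[sentence] += 1
    return counts

# Made-up sample: the first cell holds two sentences, the second holds one.
cells = ["The cat is pink\nThe dog is green", "The cat is pink"]
counts = count_sentences(cells)
```

Here counts["The cat is pink"] is 2 and counts["The dog is green"] is 1, matching the example in the question.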

Check if string contains a substring from a list Excel VBA

I have cells that contain account numbers in column A and strings that contain text and account numbers in column B.
[Data example]
I would like to create a list of accounts and then check if any of those accounts is contained in column B. If it is I want to extract this account number to column C (in the same row it was found). I am a VBA noob so I'm not sure how this could be done.
I asked a similar but much more complex question earlier this week, but this should be easier to "solve".
[This is how I would like it to look like after processing]
There might be other numbers with the same length as the account numbers in column B that are NOT account numbers, so this excludes some solutions.
In cell C1, use this formula and copy down:
=IF(A1="",INDEX($A$1:$A$16,MATCH(1,INDEX(COUNTIF(B1,"*"&$A$1:$A$16&"*")*($A$1:$A$16<>""),),0)),A1)
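The matching step of that formula can be sketched in Python as a plain substring search against the account list. This is only an illustration: the account numbers and text below are made up, the function name is hypothetical, and the sketch returns an empty string where the formula would show #N/A.

```python
def find_account(text, accounts):
    """Return the first account from the list that appears in text, else ''."""
    haystack = str(text)
    for account in accounts:
        account = str(account).strip()
        # Skip blank entries, like the ($A$1:$A$16<>"") test in the formula.
        if account and account in haystack:
            return account
    return ""

# Made-up data: 555123 has the same length but is not in the account list.
accounts = ["100234", "", "100987"]
found = find_account("Payment ref 100987 from 555123", accounts)
missing = find_account("Transfer 555123", accounts)
```

Because only values from the list can match, other numbers of the same length in column B are ignored.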

sum function in GREL in OpenRefine

In OpenRefine, I'm trying to increase the value of every number in a column by 1.
The GREL expression sum([value],1) gives me Error: sum expects an array of numbers.
I guess I don't know how to produce an array of numbers. When I use a different function on the same column, such as tan([value]), I get the result I want.
I think you misunderstood the use of sum(). If you just want to add 1 to each cell, just use value + 1.
Make sure, however, that your column contains numbers (in green) and not strings (in black). If in doubt, use toNumber(value) + 1 instead.
The sum() function adds up all the numbers in an array, for instance sum([1,2,3,4]) = 10, but you have no array when each cell of your column contains a single number.

Varying Format "Part Number" sort issue

(Current Sort Sample:)
2-1203-4
2-1206-3
2CM-
3-1610-1
3-999
…
AR3021-A-7802
AR3021-A-7802-1
B43570-
B43570-3
I am working on an 8000+ record parts list. The challenge I am running into is that different manufacturers of the parts use many varying formats for their part numbers. “Part Number” is the field I wish to sort my entire worksheet on. (There are about 10 columns of data in this worksheet.)
My methodology for attacking this challenge was to count the number of characters to the left of any “-” and count the total number of numeric characters in the field. (I also set “Part Numbers” that started with a non-numeric character to a count value of 99 for both count calculations so those would sort after the numeric values.) From this, I was able to sort on the values to the left of the “-” using the MIN of the two counts. (My “Part Numbers” are in Column B and I have a header row, which means that my first “Part Number” is in cell B2.)
This method worked up to a point. My challenge is that I need to subsequently sort values after the “-” character, as is illustrated by the erroneous sort of “3-1610-1” being followed by “3-999”.
One of the limitations I see is that sorting with Data > Sort only gives three columns to sort on. Sorting on just the characters to the left of the “-” is costing me those three columns. So, I am unable to repeat the whole process of counting values after the “-” character and subsequently sorting with Data > Sort after running the primary sort.
Has the sort of many differing formats of a field such as “Part Number” been solved? Is there a macro that can be applied to this challenge? If so, I would be grateful for your input.
This data is continuously updated with new part numbers so the goal here is to be able to add those additional part numbers to the bottom of the worksheet and use a macro to correctly resort the appended list.
For the record, I am not married to my approach. After all, it didn’t solve my challenge!
Thank you,
Darrell
Place this procedure in a standard code module:
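Public Sub PartNumberSortFormat()
    ' Builds a zero-padded copy of each part number in a new column A
    ' so that a plain A-Z sort orders the parts as expected.
    Dim i&, j&, f, vIn, vOut
    ' Part numbers: B2 down to the last text cell in column B.
    vIn = [b2:index(b:b,match("*",b:b,-1))]
    vOut = vIn
    For i = 1 To UBound(vIn)
        ' Strip spaces, then split the part number on dashes.
        f = Split(Replace(vIn(i, 1), " ", ""), "-")
        For j = 0 To UBound(f)
            If IsNumeric(f(j)) Then
                f(j) = Format$(f(j), "000000")
            ElseIf Len(f(j)) < 6 Then
                ' Left-pad short text segments; longer ones are kept as-is
                ' (String$ raises an error on a negative length).
                f(j) = String$(6 - Len(f(j)), "0") & f(j)
            End If
        Next
        vOut(i, 1) = Join(f, "-")
    Next
    Columns(1).Insert xlToRight
    [a1] = "SORT COLUMN"
    [a2].Resize(UBound(vOut)) = vOut
    Columns(1).EntireColumn.AutoFit
End Sub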
After running the procedure, you will notice that it has inserted a new column A on your worksheet and your data has been scooted over to the right by one column.
This new column A will contain a copy of your part numbers, reformatted in such a fashion to allow normal sorting.
Now select all of the data INCLUDING this new column A and sort A-Z on column A.
After the sort, you may delete the new column A.
This works by padding each dash-separated segment of the part number with leading zeros to a width of six characters.
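The padding idea is easy to try outside Excel. Below is a minimal Python sketch of the same technique, not a port of the macro: the function name and the width parameter are assumptions, and the sample values come from the question. Python compares strings by code point, so the ordering of letters and punctuation may differ slightly from Excel's sort.

```python
def sort_key(part_number, width=6):
    """Pad each dash-separated segment to `width` characters with leading zeros."""
    segments = part_number.replace(" ", "").split("-")
    # rjust leaves segments that are already `width` or longer unchanged.
    return "-".join(segment.rjust(width, "0") for segment in segments)

# Sample part numbers from the question, deliberately out of order.
parts = ["3-1610-1", "3-999", "2-1206-3", "2-1203-4", "B43570-3", "B43570-"]
ordered = sorted(parts, key=sort_key)
```

With these inputs, "3-999" becomes "000003-000999" and "3-1610-1" becomes "000003-001610-000001", so "3-999" now sorts before "3-1610-1".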
My Thoughts:
Excel 2010 onwards lets you sort using as many columns as you like. (Not sure about 2007). Don't know which version you have!
You could use the SUBSTITUTE function to remove all "-" characters from the part number and then sort on the number that remains, which gives you an order closer to the one you want.
e.g.
Value       =SUBSTITUTE(B2,"-","")
3-15        315
3-888       3888
3-999       3999
3-1610      31610
3-2610      32610
3-1610-1    316101
3-2610-3    326103
It's not exactly what you need though!
Combine this with other formulas (or a VBA function) to manipulate your part number into something more sortable.
You could use FIND to find the position of the first "-" and extract the numbers before it into one column.
Similarly, using FIND, MID and LEN, you could extract the numbers between two "-" characters in a part number.
I suspect it will be best to write a VBA function to convert a part number into a "sortable value". This might mean splitting the part number into its component bits (i.e. each bit being the text between the "-" characters).
(The VBA function Split might be useful for this. It creates an array.)
If you know the formats of ALL the part numbers that can be delivered, you can code accordingly.
I suspect your code will take part numbers like these and convert them as shown:
AB123-456-78    AB12300456007800
AB12-45-7       AB12000450007000
AB12-45         AB12000450000000
i.e. padding each component of the part number with zeros
The key to sorting the TEXTUAL values into the order you want is understanding how textual values get sorted! Do some experiments. Then create zero (or "9") padded values that sort the part numbers as you require.
I hope this helps.
While not a technical answer to the Excel question, I am a logistician working with extremely large data sets of part numbers - always varying in format. The standard approach used in my field is to "ignore" (remove) special characters from the P/N and append the (clean) P/N to the 5-digit CAGE (manufacturer) code to create a "unique" CAGE + (clean) P/N code for sorting, lookup, etc. Create a column for that construct.
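As a rough Python sketch of that construct: strip everything except letters and digits from the part number, then prepend the CAGE code. The function names and the CAGE code in the example are hypothetical, and uppercasing the result is an added assumption to make keys compare consistently.

```python
import re

def clean_part_number(part_number):
    """Drop everything except letters and digits, then uppercase (an assumption)."""
    return re.sub(r"[^A-Za-z0-9]", "", part_number).upper()

def cage_key(cage_code, part_number):
    """Prepend the manufacturer's CAGE code to the cleaned part number."""
    return cage_code.upper() + clean_part_number(part_number)

# "1ABC2" is a made-up CAGE code used only for this example.
key = cage_key("1ABC2", "AR3021-A-7802")
```

The resulting key, "1ABC2AR3021A7802", can go in its own column and be used for sorting or lookups.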

Check the length of data in a cell and add a zero

I have a spreadsheet containing data in the following format:
        Col1      Col2
ROW1:   21211     Customer 3873721
ROW2:   101111    Customer 2321422
ROW3:   91214     Customer 2834712
ROW4:   231014    Customer 3729123
I need to be able to create a macro that goes through each row and determines the number of characters that make up the row1 data.
For example:
If data contained in the first cell or ROW1 consisted of a total of 6
characters then this will remain the same. If it consisted of 5
characters then a zero needs to be added to the front of it.
I'm using Excel 2003.
VBA is not needed in this case. A simple IF statement should do the trick. Assuming the column you want to evaluate is column A, place this in column B and copy it down the column:
=IF(LEN(A1)=5,"0" & A1,A1)
This will work provided all values are 5 or 6 characters long as it appears in your sample data.
It seems OP may be happy with a Custom Format of:
#000000
This is easy to apply, and it may be the better option because most other ways of prepending a 0 require converting numeric values to strings (otherwise Excel automatically strips leading zeros). If that conversion is done selectively (for length 5 but not length 6), a column of what appear to be numbers can end up with mixed types that behave differently, which may confuse or inconvenience.
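The string-versus-number point is visible in a short Python sketch: padding necessarily produces text, not a number. The function name is made up, and unlike the IF formula above, zfill pads any value shorter than six characters, not only those of length 5.

```python
def pad_to_six(value):
    """Return the value as a string, left-padded with zeros to six characters."""
    return str(value).zfill(6)

# Col1 values from the question's sample data.
padded = [pad_to_six(v) for v in (21211, 101111, 91214, 231014)]
```

Every result here is a string, which mirrors why a custom number format is the only option that keeps the cell values numeric.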