Extracting last name from a range having suffixes using VBA

I have a list of full names in a column, for example:
Dave M. Butterworth
Dave M. Butterworth,II
H.F. Light jr
H.F. Light ,jr.
H.F. Light sr
Halle plumerey
The suffixes (jr, sr, II and so on) are not limited to the ones shown here.
I want to extract the last name using a generic function. Is that possible?

Consider the following UDF:
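Public Function LastName(sIn As String) As String
    Dim ary() As String, bry() As String
    Dim Ka As Long, t As String
    ary = Split(Trim$(sIn), " ")
    Ka = UBound(ary)
    If Ka < 0 Then Exit Function ' empty input: nothing to extract
    t = ary(Ka)
    ' Skip a trailing suffix token
    If t = "jr" Or t = ",jr" Or t = "sr" Or t = ",jr." Then
        If Ka > 0 Then Ka = Ka - 1
    End If
    t = ary(Ka)
    If InStr(1, t, ",") = 0 Then
        LastName = t
        Exit Function
    End If
    ' "Butterworth,II": keep the part before the comma
    bry = Split(t, ",")
    LastName = bry(LBound(bry))
End Function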
NOTE:
You will have to expand this line:
If t = "jr" Or t = ",jr" Or t = "sr" Or t = ",jr." Then
to include all other suffixes you wish to exclude. You will also have to update this code to handle other exceptions as you find them.

Remove punctuation, split to an array and walk backwards until you find a string that does not match a lookup of ignorable monikers like "ii/jr/sr/dr".
You could also add a check to eliminate tokens based on their length.
Function LastName(name As String) As String
    Dim parts() As String, i As Long
    ' Turn commas into spaces so "Butterworth,II" splits into two tokens; drop periods
    parts = Split(Trim$(Replace$(Replace$(name, ",", " "), ".", "")), " ")
    For i = UBound(parts) To 0 Step -1
        Select Case UCase$(parts(i))
            Case "", "JR", "SR", "DR", "I", "II"
                ' ignorable token, keep walking backwards
            Case Else
                LastName = parts(i)
                Exit Function
        End Select
    Next
End Function
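The lookup and the length check mentioned above could be combined roughly as follows. This is only a sketch: the `SUFFIXES` list, the helper name `IsSuffix`, the function name `LastNameByLookup` and the "skip tokens of one character or less" rule are all assumptions for illustration, and a one-letter surname would be skipped by that rule.

```vba
Option Explicit

' Hypothetical lookup list; extend it to suit your own data.
Private Const SUFFIXES As String = "|JR|SR|DR|I|II|III|IV|"

' True when the token appears in the delimited lookup list (case-insensitive).
Public Function IsSuffix(ByVal token As String) As Boolean
    IsSuffix = InStr(1, SUFFIXES, "|" & UCase$(token) & "|", vbBinaryCompare) > 0
End Function

Public Function LastNameByLookup(ByVal fullName As String) As String
    Dim parts() As String, i As Long
    ' Commas become spaces so "Butterworth,II" yields two tokens; periods are dropped.
    fullName = Replace$(Replace$(fullName, ",", " "), ".", "")
    parts = Split(Trim$(fullName), " ")
    ' Walk backwards past empty tokens, one-character tokens and known suffixes.
    For i = UBound(parts) To LBound(parts) Step -1
        If Len(parts(i)) > 1 And Not IsSuffix(parts(i)) Then
            LastNameByLookup = parts(i)
            Exit Function
        End If
    Next i
End Function
```

Keeping the suffixes in one delimited constant means a new suffix is a one-word change rather than another `Or` clause.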


Function which Removes Only Non-ASCII characters in a column in access table

I have an Access table and I am writing VBA code to remove non-ASCII characters from it. I have tried the two functions below:
Public Function removeall(stringData As String) As String
Dim letter As Integer
Dim final As String
Dim i As Integer
For i = 1 To Len(stringData) 'loop thru each char in stringData
letter = Asc(Mid(stringData, i, 1)) 'find the char and assign asc value
Select Case letter 'Determine what type of char it is
Case Is < 91 And letter > 64 'is an upper case char
final = final & Chr(letter)
Case Is < 123 And letter > 96 'is an lower case char
final = final & Chr(letter)
Case Is = 32 'is a space
final = final & Chr(letter)
End Select
Next i
removeall = final
End Function
I also tried the function below:
Public Function Clean(InString As String) As String
'-- Returns only printable characters from InString
Dim x As Integer
For x = 1 To Len(InString)
If Asc(Mid(InString, x, 1)) > 31 And Asc(Mid(InString, x, 1)) < 127 Then
Clean = Clean & Mid(InString, x, 1)
End If
Next x
End Function
The problem is that removeall removes everything, including # and space characters, and Clean removes special characters as well.
I need a function that retains keyboard characters and removes all others.
Examples of strings in the table:
1) "ATTACHMENT FEEDING TUBE FITS 5-18 ºFR# "
2) "CATHETER FOLEY 3WAY SILI ELAST 20FR 30ML LATEXº"
Any help would be greatly appreciated.
The output should look like this:
1) "ATTACHMENT FEEDING TUBE FITS 5-18 FR"
2) "CATHETER FOLEY 3WAY SILI ELAST 20FR 30ML LATEX"
One approach would be to use a whitelist of accepted characters. e.g.
' You can set up your domain specific list:
Const Whitelist = "1234567890" & _
    "qwertyuiopasdfghjklzxcvbnm" & _
    "QWERTYUIOPASDFGHJKLZXCVBNM" & _
    " `~!@#$%^&*()_-=+[]{};:""'|\<>?/ –"
Public Sub test()
Debug.Print Clean("ATTACHMENT FEEDING TUBE FITS 5-18 ºFR#")
Debug.Print Clean("CATHETER FOLEY 3WAY SILI ELAST 20FR 30ML LATEXº")
End Sub
Public Function isAllowed(char As String) As Boolean
isAllowed = InStr(1, Whitelist, char, vbBinaryCompare) > 0
End Function
Public Function Clean(dirty As String) As String
'-- Returns only printable characters from dirty
Dim x As Integer
Dim c As String
For x = 1 To Len(dirty)
c = Mid(dirty, x, 1)
If isAllowed(c) Then
Clean = Clean & c
End If
Next x
End Function
Alternate approach that preserves all ASCII characters (codes 0 to 127), without working with a whitelist, in a single function:
Public Function RemoveNonASCII(str As String) As String
    Dim i As Long
    Dim c As String
    For i = 1 To Len(str)
        c = Mid$(str, i, 1)
        ' AscW returns a signed Integer, so code points above 32767 come back negative;
        ' test the full 0 to 127 range rather than only "< 127"
        Select Case AscW(c)
            Case 0 To 127 'It's an ASCII character
                RemoveNonASCII = RemoveNonASCII & c 'Append it
        End Select
    Next i
End Function
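If only the printable keyboard range is wanted (space through tilde, codes 32 to 126), a regular expression can do the filtering in one call. A minimal sketch, assuming late binding to `VBScript.RegExp` is available on the machine; the function name `KeepPrintableAscii` is made up for illustration:

```vba
Option Explicit

' Removes every character outside the printable ASCII range (space to tilde).
Public Function KeepPrintableAscii(ByVal dirty As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                 ' replace every match, not only the first
    re.Pattern = "[^\x20-\x7E]"      ' anything that is NOT code 32..126
    KeepPrintableAscii = re.Replace(dirty, "")
End Function
```

Unlike the whitelist, this keeps `#` and every other printable ASCII symbol, and it also drops control characters such as tabs and line breaks.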

VBA: matching multiple strings

My question is best understood through the following example. My goal is to classify a string into a category if it matches any one of the strings defined for that category. For example:
dim test_str as string
test_str = "tomato"
If the test string tomato matches any one of the keywords (1) potato, (2) tomato and (3) spaghetti, then tomato will be classified as food.
I have a very inefficient way of doing this now, which involves using multiple strcomp, i.e.
If StrComp(test_str, "potato", vbTextCompare) = 0 Or _
   StrComp(test_str, "tomato", vbTextCompare) = 0 Or _
   StrComp(test_str, "spaghetti", vbTextCompare) = 0 Then
    'label test str as "food"
End If
However, if I have 10 keywords defined within "food", I would then need 10 StrComp calls, which would be tedious. Is there a better way to do this?
I would simply store all the combinations in a string and check that the value is present with InStr:
Const food = "|potato|tomato|spaghetti|"
Dim test_str As String
test_str = "tomato"
If InStr(1, food, "|" & test_str & "|", vbTextCompare) Then
Debug.Print "food"
Else
Debug.Print "not food"
End If
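When there are several categories, a keyword-to-category dictionary is another option. The sketch below assumes `Scripting.Dictionary` can be created via late binding; the function name `Classify`, the "drink" entry and the "unknown" fallback are invented for illustration:

```vba
Option Explicit

' Returns the category for a keyword, or "unknown" when the keyword is not listed.
Public Function Classify(ByVal word As String) As String
    Static categories As Object      ' built once, reused on later calls
    If categories Is Nothing Then
        Set categories = CreateObject("Scripting.Dictionary")
        categories.CompareMode = vbTextCompare   ' case-insensitive keys
        categories.Add "potato", "food"
        categories.Add "tomato", "food"
        categories.Add "spaghetti", "food"
        categories.Add "water", "drink"          ' hypothetical second category
    End If
    If categories.Exists(word) Then
        Classify = categories(word)
    Else
        Classify = "unknown"
    End If
End Function
```

Adding a keyword is then one `Add` line, and each lookup is a single `Exists` call regardless of how many keywords are stored.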
Write a helper function:
Function ArrayWordNotInText(textValue, arrayKeyword)
    Dim i
    ArrayWordNotInText = -1
    For i = LBound(arrayKeyword) To UBound(arrayKeyword)
        ' StrComp returns 0 on a match; "Not StrComp(...)" is also true for a result of 1
        If StrComp(textValue, arrayKeyword(i), vbTextCompare) = 0 Then
            ArrayWordNotInText = i
            Exit For
        End If
    Next i
End Function
A return value of -1 means no match; a value >= 0 is the index of the matching word.
This is my first time posting, so excuse my formatting. I have not been using VBA for long, but I was able to piece this together.
Sub vinden4()
    Dim EXCEPT() As String, i As Long
    Dim EM As String, Exceptions As String, NOREPLY As Long
    EM = "no.replynoreply#ziggo.nl"
    ' No trailing comma: an empty element would make InStr match every address
    Exceptions = "no-Reply,noreply,nO.reply"
    EXCEPT = Split(Exceptions, ",")
    For i = LBound(EXCEPT) To UBound(EXCEPT)
        NOREPLY = InStr(1, EM, EXCEPT(i), vbTextCompare)
        If NOREPLY > 0 Then
            'CbEM.Value = True '~food~
            EM = InputBox("NOREPLY E-MAIL ADDRESS", "Enter e-mail address", EM)
            Exit For ' one prompt is enough once any keyword matches
            'else
            'CbEM.Value = False ~not food~
        End If
    Next i
    MsgBox EM
End Sub
Hope this can help someone.

Excel VBA Custom Function Remove Words Appearing in One String From Another String

I am trying to remove words appearing in one string from a different string using a custom function. For instance:
A1:
the was why blue hat
A2:
the stranger wanted to know why his blue hat was turning orange
The ideal outcome in this example would be:
A3:
stranger wanted to know his turning orange
The cell references need to be arguments, so that the function can be used in different situations.
The function will be used in a cell as:
=WORDREMOVE("cell with words needing remove", "cell with list of words being removed")
I have a list of 20,000 rows and have managed to find a custom function that can remove duplicate words (below) and thought there may be a way to manipulate it to accomplish this task.
Function REMOVEDUPEWORDS(txt As String, Optional delim As String = " ") As String
Dim x
'Updateby20140924
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
For Each x In Split(txt, delim)
If Trim(x) <> "" And Not .exists(Trim(x)) Then .Add Trim(x), Nothing
Next
If .Count > 0 Then REMOVEDUPEWORDS = Join(.keys, delim)
End With
End Function
If you can guarantee that the words in both strings will be separated by spaces (no commas, ellipses, etc.), you could just Split() both strings and then Filter() out the words. Note that Filter() matches substrings, so removing "hat" also removes "that" and "what":
Function WORDREMOVE(ByVal strText As String, strRemove As String) As String
Dim a, w
a = Split(strText) ' Start with all words in an array
For Each w In Split(strRemove)
a = Filter(a, w, False, vbTextCompare) ' Remove every word found
Next
WORDREMOVE = Join(a, " ") ' Recreate the string
End Function
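Because Filter() matches substrings rather than whole words, a dictionary of the words to drop may be a safer starting point. A minimal sketch, assuming words are separated by single or repeated spaces and that `Scripting.Dictionary` is available; the name `RemoveWholeWords` is made up for illustration:

```vba
Option Explicit

' Removes from source every whole word that appears in removeList (case-insensitive).
Public Function RemoveWholeWords(ByVal source As String, ByVal removeList As String) As String
    Dim stopWords As Object, w As Variant, kept As String
    Set stopWords = CreateObject("Scripting.Dictionary")
    stopWords.CompareMode = vbTextCompare
    ' Collect the words to remove; empty tokens come from repeated spaces.
    For Each w In Split(removeList, " ")
        If Len(w) > 0 Then stopWords(w) = True
    Next w
    ' Keep only words that are not in the dictionary.
    For Each w In Split(source, " ")
        If Len(w) > 0 Then
            If Not stopWords.Exists(w) Then kept = kept & " " & w
        End If
    Next w
    RemoveWholeWords = Mid$(kept, 2)   ' drop the leading space
End Function
```

With the example from the question, `=RemoveWholeWords(A2, A1)` should give "stranger wanted to know his turning orange", and words such as "what" survive when only "hat" is in the removal list.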
You can also do this using Regular Expressions in VBA. The version below is case insensitive and assumes all words are separated only by space. If there is other punctuation, more examples would aid in crafting an appropriate solution:
Option Explicit
Function WordRemove(Str As String, RemoveWords As String) As String
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .ignorecase = True
        .Global = True
        ' \b anchors keep "hat" from matching inside "what" or "that"
        .Pattern = "\b(?:" & Join(Split(WorksheetFunction.Trim(RemoveWords)), "|") & ")\b\s*"
        WordRemove = Trim$(.Replace(Str, ""))
    End With
End Function
My example is certainly not the best code, but it should work. Note that here the first argument is the list of words to remove and the second is the text to clean, and that the comparison is case-sensitive:
Function WORDREMOVE(FirstCell As String, SecondCell As String)
    Dim FirstArgument As Variant, SecondArgument As Variant
    Dim FirstArgumentCounter As Integer, SecondArgumentCounter As Integer
    Dim Checker As Boolean
    WORDREMOVE = ""
    FirstArgument = Split(FirstCell, " ")
    SecondArgument = Split(SecondCell, " ")
    For SecondArgumentCounter = 0 To UBound(SecondArgument)
        Checker = False
        For FirstArgumentCounter = 0 To UBound(FirstArgument)
            If SecondArgument(SecondArgumentCounter) = FirstArgument(FirstArgumentCounter) Then
                Checker = True
            End If
        Next FirstArgumentCounter
        If Checker = False Then WORDREMOVE = WORDREMOVE & SecondArgument(SecondArgumentCounter) & " "
    Next SecondArgumentCounter
    ' Strip the trailing space, unless every word was removed
    If Len(WORDREMOVE) > 0 Then WORDREMOVE = Left(WORDREMOVE, Len(WORDREMOVE) - 1)
End Function

Implementing a simple substitution cipher using VBA

I am trying to make a program that changes letters in a string, and I keep running into an obvious issue: if it changes a value, say A to M, then when it gets to M it changes that M to something else. When I run the code to change it all back, it converts the letter as if it were originally an M, not an A.
Any ideas how to make the code not change letters it has already changed?
As for code, I've just got about 40 lines of this (I'm sure there's a cleaner way to do it, but I'm new to VBA, and when I tried Select Case it would only change one letter and not go through all of them):
Text1.value = Replace(Text1.value, "M", "E")
Try this:
Dim strToChange As String
strToChange = "This is my string that will be changed"
Dim arrReplacements As Variant
arrReplacements = Array(Array("a", "m"), _
Array("m", "z"), _
Array("s", "r"), _
Array("r", "q"), _
Array("t", "a"))
Dim strOutput As String
strOutput = ""
Dim i As Integer
Dim strCurrentLetter As String
For i = 1 To Len(strToChange)
strCurrentLetter = Mid(strToChange, i, 1)
Dim arrReplacement As Variant
For Each arrReplacement In arrReplacements
If (strCurrentLetter = arrReplacement(0)) Then
strCurrentLetter = Replace(strCurrentLetter, arrReplacement(0), arrReplacement(1))
Exit For
End If
Next
strOutput = strOutput & strCurrentLetter
Next
Here is the output:
Thir ir zy raqing ahma will be chmnged
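Loop through it using the Mid function, building the result one character at a time so that a character that has already been replaced is never looked at again. Something like:
MyVal = Text1.Value
NewVal = ""
For X = 1 To Len(MyVal)
    NewVal = NewVal & Replace(Mid(MyVal, X, 1), "M", "E")
Next X
Text1.Value = NewVal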
EDIT: On further reflection, I'm going to make one change. Store the pairs in a table. Then you can use DLookup to do the translation, using the same concept:
MyVal = Text1.Value
Result = ""
For X = 1 To Len(MyVal)
    OldChar = Mid(MyVal, X, 1)
    ' DLookup takes (field, table, criteria); Nz keeps characters that have no row in the table
    NewVal = Nz(DLookup("fldNewVal", "tblConvert", "fldOldVal = '" & Replace(OldChar, "'", "''") & "'"), OldChar)
    Result = Result & NewVal
Next X
Here's another way that uses fewer loops:
Public Function Obfuscate(sInput As String) As String
Dim vaBefore As Variant
Dim vaAfter As Variant
Dim i As Long
Dim sReturn As String
sReturn = sInput
vaBefore = Split("a,m,s,r,t", ",")
vaAfter = Split("m,z,r,q,a", ",")
For i = LBound(vaBefore) To UBound(vaBefore)
sReturn = Replace$(sReturn, vaBefore(i), "&" & Asc(vaAfter(i)))
Next i
For i = LBound(vaAfter) To UBound(vaAfter)
sReturn = Replace$(sReturn, "&" & Asc(vaAfter(i)), vaAfter(i))
Next i
Obfuscate = sReturn
End Function
It turns every letter into an ampersand plus the replacement letter's ASCII code. Then it turns every such code into the replacement letter.
It took about 5 milliseconds vs 20 milliseconds for the nested loops.
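A single pass over the string with two parallel alphabets avoids both the re-replacement problem and the need for placeholders. The following is a sketch under stated assumptions: the name `SubstituteChars` and its parameters are invented here, matching is case-sensitive, and each character in `fromChars` maps to the character at the same position in `toChars`.

```vba
Option Explicit

' Replaces each character found in fromChars with the character at the
' same position in toChars. Every input character is examined exactly once.
Public Function SubstituteChars(ByVal source As String, _
                                ByVal fromChars As String, _
                                ByVal toChars As String) As String
    Dim i As Long, pos As Long, result As String
    If Len(fromChars) <> Len(toChars) Then Err.Raise 5   ' invalid procedure call
    result = source
    For i = 1 To Len(source)
        ' Look up the ORIGINAL character, so earlier replacements are never revisited.
        pos = InStr(1, fromChars, Mid$(source, i, 1), vbBinaryCompare)
        If pos > 0 Then Mid$(result, i, 1) = Mid$(toChars, pos, 1)
    Next i
    SubstituteChars = result
End Function
```

Calling `SubstituteChars("This is my string that will be changed", "amsrt", "mzrqa")` reproduces the output shown earlier. Swapping the last two arguments decodes this particular sentence, but decoding is only reliable in general when the mapping is a true permutation, that is, when every target letter is also a source letter.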

VB.net Question with array search

I have an array of 10 lines, each holding a first name, a space, a last name, a space, and a zip code. All the zip codes start with different digits. Is there a way to replace the "1" in the IndexOf call below so that it searches for any digit instead?
'open file
inFile = IO.File.OpenText("Names.txt")
'process the loop instruct until end of file
intSubscript = 0
Do Until inFile.Peek = -1 OrElse intSubscript = strLine.Length
strLine(intSubscript) = inFile.ReadLine
intSubscript = intSubscript + 1
Loop
inFile.Close()
intSubscript = 0
strFound = "N"
Do Until strFound = "Y" OrElse intSubscript = strLine.Length
intIndex = strLine(intSubscript).IndexOf("1")
strName = strLine(intSubscript).Substring(0, intIndex - 1)
If strName = strFullname Then
strFound = "Y"
strZip = strLine(intSubscript).Substring(strLine(intSubscript).Length - 5, 5)
txtZip.Text = strZip
End If
Loop
End Sub
Use a regular expression.
Regular expressions allow you to do pattern matching on text. It's like String.IndexOf() with wildcard support.
For example, suppose your source data looks like this:
James Harvey 10939
Madison Whittaker 33893
George Keitel 22982
...and so on.
Expressed in English, the pattern each line follows is this:
the beginning of the string, followed by
a sequence of 1 or more alphabetic characters, followed by
a sequence of one or more spaces, followed by
a sequence of 1 or more alphabetic characters, followed by
a sequence of one or more spaces, followed by
a sequence of 5 numeric digits, followed by
the end of the string
You can express that very precisely and succinctly in regex this way:
^([A-Za-z]+) +([A-Za-z]+) +([0-9]{5})$
Apply it in VB this way:
Dim sourceData As String = _
    "James Harvey 10939" & _
    vbCrLf & _
    "Madison Whittaker 33893" & _
    vbCrLf & _
    "George Keitel 22982"
' Named "pattern" so the variable does not clash with the Regex type
Dim pattern = "^([A-Za-z]+) +([A-Za-z]+) +([0-9]{5})$"
Dim re = New Regex(pattern)
Dim lineData As String() = sourceData.Split(vbCrLf.ToCharArray(), _
    StringSplitOptions.RemoveEmptyEntries)
For i As Integer = 0 To lineData.Length - 1
    System.Console.WriteLine("'{0}'", lineData(i))
    Dim matchResult As Match = re.Match(lineData(i))
    ' Only read the groups when the line actually matched the pattern
    If matchResult.Success Then
        System.Console.WriteLine(" zip: {0}", matchResult.Groups(3).ToString())
    End If
Next i
To get that code to compile, you must import the System.Text.RegularExpressions namespace at the top of your VB module, to get the Regex and Match types.
If your input data follows a different pattern, then you will need to adjust your regex.
For example if it could be "Chris McElvoy III 29828", then you need to adjust the regex accordingly, to handle the name suffix.