VB.NET question with array search

I have an array of 10 lines, each in the form first name, space, last name, space, zip code. The zip codes all start with different digits. Is there a way to replace the "1" in the IndexOf call below so that it searches for any numeric character instead?
'open file
inFile = IO.File.OpenText("Names.txt")
'process the loop instruct until end of file
intSubscript = 0
Do Until inFile.Peek = -1 OrElse intSubscript = strLine.Length
strLine(intSubscript) = inFile.ReadLine
intSubscript = intSubscript + 1
Loop
inFile.Close()
intSubscript = 0
strFound = "N"
Do Until strFound = "Y" OrElse intSubscript = strLine.Length
intIndex = strLine(intSubscript).IndexOf("1")
strName = strLine(intSubscript).Substring(0, intIndex - 1)
If strName = strFullname Then
strFound = "Y"
strZip = strLine(intSubscript).Substring(strLine(intSubscript).Length - 5, 5)
txtZip.Text = strZip
End If
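intSubscript = intSubscript + 1 'advance to the next line; without this the loop never ends when the first line does not match
Loop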
End Sub

Use a regular expression.
Regular expressions allow you to do pattern matching on text. It's like String.IndexOf() with wildcard support.
For example, suppose your source data looks like this:
James Harvey 10939
Madison Whittaker 33893
George Keitel 22982
...and so on.
Expressed in English, the pattern each line follows is this:
the beginning of the string, followed by
a sequence of 1 or more alphabetic characters, followed by
a sequence of one or more spaces, followed by
a sequence of 1 or more alphabetic characters, followed by
a sequence of one or more spaces, followed by
a sequence of 5 numeric digits, followed by
the end of the string
You can express that very precisely and succinctly in regex this way:
^([A-Za-z]+) +([A-Za-z]+) +([0-9]{5})$
Apply it in VB this way:
Dim sourcedata As String = _
"James Harvey 10939" & _
vbcrlf & _
"Madison Whittaker 33893" & _
vbcrlf & _
"George Keitel 22982"
Dim pattern As String = "^([A-Za-z]+) +([A-Za-z]+) +([0-9]{5})$"
Dim re As New Regex(pattern) 'the variable is named pattern so it does not collide with the Regex type name
Dim lineData As String() = sourceData.Split(vbcrlf.ToCharArray(), _
StringSplitOptions.RemoveEmptyEntries )
For i As Integer = 0 To lineData.Length -1
System.Console.WriteLine("'{0}'", lineData(i))
Dim matchResult As Match = re.Match(lineData(i))
If matchResult.Success Then 'only read the groups when the line fits the pattern
System.Console.WriteLine(" zip: {0}", matchResult.Groups(3).Value)
End If
Next i
To get that code to compile, you must import the System.Text.RegularExpressions namespace at the top of your VB module, to get the Regex and Match types.
If your input data follows a different pattern, then you will need to adjust your regex.
For example if it could be "Chris McElvoy III 29828", then you need to adjust the regex accordingly, to handle the name suffix.
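As a rough sketch of both ideas in this thread, the module below shows one way to loosen the regex so the name part may contain two or more words (which covers a suffix such as "III"). It also shows a non-regex option for the literal question, using String.IndexOfAny to find the first digit. The module and function names (ZipLookupSketch, FirstDigitIndex, ExtractZip) are made up for illustration, and the pattern assumes names consist of plain ASCII letters only.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ZipLookupSketch
    ' Hypothetical helper: index of the first digit in the line, or -1 if there is none.
    Function FirstDigitIndex(line As String) As Integer
        Return line.IndexOfAny("0123456789".ToCharArray())
    End Function

    ' Hypothetical helper: returns the 5-digit zip at the end of the line,
    ' or an empty string when the line does not fit the pattern.
    Function ExtractZip(line As String) As String
        ' Group 1: one or more space-separated alphabetic words (the full name, suffix included).
        ' Group 2: exactly five digits at the end of the line.
        Dim re As New Regex("^([A-Za-z]+(?: +[A-Za-z]+)*) +([0-9]{5})$")
        Dim m As Match = re.Match(line)
        If m.Success Then Return m.Groups(2).Value
        Return ""
    End Function

    Sub Main()
        Dim line As String = "Chris McElvoy III 29828"

        ' Non-regex route: everything before the first digit is the name.
        Dim idx As Integer = FirstDigitIndex(line)
        If idx > 0 Then
            Console.WriteLine("name: '{0}'", line.Substring(0, idx).TrimEnd())
        End If

        ' Regex route: the zip is the second capture group.
        Console.WriteLine("zip: '{0}'", ExtractZip(line))
    End Sub
End Module
```

In the question's loop, the IndexOfAny call would take the place of IndexOf("1"). It is worth checking for a result of -1 before calling Substring, since a line with no digit would otherwise throw.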

Related

Extracting last name from a range having suffixes using VBA

I have a list of full names in a column like for example:
Dave M. Butterworth
Dave M. Butterworth,II
H.F. Light jr
H.F. Light ,jr.
H.F. Light sr
Halle plumerey
The names are in a column. The initials are not limited to these only.
I want to extract the last name using a generic function. Is it possible?
Consider the following UDF:
Public Function LastName(sIn As String) As String
Dim Ka As Long, t As String
Dim ary As Variant, bry As Variant 'declared so the function also compiles under Option Explicit
ary = Split(sIn, " ")
Ka = UBound(ary)
t = ary(Ka)
If t = "jr" Or t = ",jr" Or t = "sr" Or t = ",jr." Then
Ka = Ka - 1
End If
t = ary(Ka)
If InStr(1, t, ",") = 0 Then
LastName = t
Exit Function
End If
bry = Split(t, ",")
LastName = bry(LBound(bry))
End Function
NOTE:
You will have to expand this line:
If t = "jr" Or t = ",jr" Or t = "sr" Or t = ",jr." Then
to include all other suffixes you wish to exclude. You will also have to update this code to handle other exceptions as you find them!
Remove punctuation, split to an array and walk backwards until you find a string that does not match a lookup of ignorable monikers like "ii/jr/sr/dr".
You could also add a check to eliminate tokens based on their length.
Function LastName(name As String) As String
Dim parts() As String, i As Long
parts = Split(Trim$(Replace$(Replace$(name, ",", ""), ".", "")), " ")
For i = UBound(parts) To 0 Step -1
Select Case UCase$(parts(i))
Case "", "JR", "SR", "DR", "I", "II"
Case Else:
LastName = parts(i)
Exit Function
End Select
Next
End Function

How to find word between character and a number in VBA

For example, I have this string that reads "IRS150Sup2500Vup". It could also be "IRS250Sdown1250Vdown".
In my previous question, I asked how to find a number between two characters. Now I need to find the word up or down that follows the second S. Since it appears between the character S and the number, how do I do it?
My code looks like this:
Dim pos As Long, pos1 As Long, pos2 As Long, strString As String
pos = InStr(1, objFile.Name, "S") + 1
pos1 = InStr(pos, objFile.Name, "S")
pos2 = InStr(pos1, objFile.Name, ?)
pos1 returns the index of the second S. I am not sure what to place in ?
Using Regex.
Note: you need a reference to the Microsoft VBScript Regular Expressions 5.5 library.
Dim r As VBScript_RegExp_55.RegExp
Dim sPattern As String, myString As String
Dim mc As VBScript_RegExp_55.MatchCollection, m As VBScript_RegExp_55.Match
myString = "IRS150Sup2500Vup"
sPattern = "\w?(up|down)" 'matches Sup, Vup, Sdown, Vdown
Set r = New VBScript_RegExp_55.RegExp
r.Global = True 'return every match, not just the first one
r.Pattern = sPattern
Set mc = r.Execute(myString)
For Each m In mc ' Iterate Matches collection.
MsgBox "word: '" & m.Value & "' found at: " & m.FirstIndex & " length: " & m.Length
Next
For further information, please see:
How To Use Regular Expressions in Microsoft Visual Basic 6.0
Find and replace text by using regular expressions (Advanced)

Query or VBA Function for adding leading zeroes to a field with special conditions

I have a macro I am trying to turn into a VBA Function or Query for adding leading zeros to a field.
For my circumstances, there need to be 4 numeric digits plus any alphabetic characters that follow, so a simple Format query doesn't do the trick.
The macro I have uses Evaluate and =Match but I am unsure how this could be achieved in Access.
Sub Change_Number_Format_In_String()
Dim iFirstLetterPosition As Integer
Dim sTemp As String
For Each c In Range("A2:A100")
If Len(c) > 0 Then
iFirstLetterPosition = Evaluate("=MATCH(TRUE,NOT(ISNUMBER(1*MID(" & c.Address & ",ROW($1:$20),1))),0)")
sTemp = Left(c, iFirstLetterPosition - 1) 'get the leading numbers
sTemp = Format(sTemp, "0000") 'format the numbers
sTemp = sTemp & Mid(c, iFirstLetterPosition, Len(c)) 'concatenate the remainder of the string
c.NumberFormat = "#"
c.Value = sTemp
End If
Next
End Sub
In my database the field in need of formatting is called PIDNUMBER
EDIT:
To expand on why Format doesn't work in my situation: some PIDNUMBERs have an alpha character after the number that should not be counted when determining how many zeroes to add.
For example:
12 should become 0012
12A should become 0012A
When using format, it counts the letters as part of the string, so 12A would become 012A instead of 0012A as intended.
You could try:
Public Function customFormat(ByRef sString As String) As String
customFormat = Right("0000" & sString, 4 + Len(sString) - Len(CStr(Val(sString))))
End Function
Try this function. If you only want it to be available in VBA, put Private in front of Function:
Function ZeroPadFront(oIn As Variant) As String
Dim zeros As Long, sOut As String
sOut = CStr(oIn)
zeros = 4 - Len(sOut)
If zeros < 0 Then zeros = 0
ZeroPadFront = String(zeros, "0") & sOut
End Function
The Val() function converts a string to a number, and strips off any trailing non-numeric characters. We can use it to figure out how many digits the numeric portion has:
Function PadAlpha$(s$)
Dim NumDigs As Long
NumDigs = Len(CStr(Val(s)))
If NumDigs < 4 Then
PadAlpha = String$(4 - NumDigs, "0") & s
Else
PadAlpha = s
End If
End Function
? padalpha("12")
> 0012
? padalpha("12a")
> 0012a
Bill,
See if this will work. It seems like a function would better suit you.
Function NewPIDNumber(varPIDNumber As Variant) As String
Dim lngLoop As Long
Dim strChar As String
For lngLoop = 1 to Len(varPIDNumber)
strChar = Mid(varPIDNumber, lngLoop, 1)
If IsNumeric(strChar) Then
NewPIDNumber = NewPIDNumber & strChar
Else
Exit For
End If
Next lngLoop
If Len(NewPIDNumber) > 4 Then
MsgBox "Bad Data Maaaaan...." & Chr(13) & Chr(13) & "The record = " & varPIDNumber
Exit Function
End If
Do Until Len(NewPIDNumber) = 4
NewPIDNumber = "0" & NewPIDNumber
Loop
End Function
Data      Result
012a      0012
12a       0012
12        0012
85        0085
85adfe    0085
1002a     1002
1002      1002

Find and replace all names of variables in VBA module

Let's assume that we have one module with only one Sub in it, and there are no comments. How can I identify all variable names? Is it possible to identify the names of variables that are not declared with Dim? I would like to identify them and replace each with some random name to obfuscate my code (O0011011010100101, for example); the replace part is much easier.
List of characters which can be used in names of macros, functions and variables:
ABCDEFGHIJKLMNOPQRSTUVWXYZdefghijklmnopqrstuvwxyzg€‚„…†‡‰Š‹ŚŤŽŹ‘’“”•–—™š›śťžź ˇ˘Ł¤Ą¦§¨©Ş«¬­®Ż°±˛ł´µ¶·¸ąş»Ľ˝ľżŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙ÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙
Below is a function I wrote recently:
Function randomName(n as integer) as string
y="O"
For i = 2 To n:
If Rnd() > 0.5 Then
y = y & "0"
Else
y = y & "1"
End If
Next i
randomName=y
End Function
To replace given strings in another string that holds the module's code, I use the sub below:
Sub substituteNames()
'count lines in "Module1" which is part of current workbook
linesCount = ActiveWorkbook.VBProject.VBComponents("Module1").CodeModule.CountOfLines
'read code from module
code = ActiveWorkbook.VBProject.VBComponents("Module1").CodeModule.Lines(StartLine:=1, Count:=linesCount)
inputStr = Array("name1", "name2", "name2") 'hard-coded array of strings to replace
namesLength = 20 'length of new variables names
For i = LBound(inputStr) To UBound(inputStr)
outputString = randomName(namesLength-1)
code = Replace(code, inputStr(i), outputString)
Next i
Debug.Print code 'view code
End Sub
Then we simply substitute the old code with the new one, but how do we identify the strings that are variable names?
Edit
Using Option Explicit weakens my simple obfuscation method, because to reverse the changes you only have to follow the Dim statements and replace the ugly names with something normal. Apart from that, to make such substitution harder, I think it is a good idea to break the line in the middle of a variable name:
O0O000O0OO0O0000 _
0O00000O0OO0
Another simple method is to replace some strings with chains of Chr calls, for example chr(104)&chr(101)&chr(108)&chr(108)&chr(111):
Sub stringIntoChrChain()
strInput = "hello"
strOutput = ""
For i = 1 To Len(strInput)
strOutput = strOutput & "chr(" & Asc(Mid(strInput, i, 1)) & ")&"
Next i
Debug.Print Mid(strOutput, 1, Len(strOutput) - 1)
End Sub
Comments like the ones below could make an impression on the user and make him think that he does not possess the right tool to deal with the macro, etc.:
'(k=Äó¬)w}ż^¦ů‡ÜOyúm=ěËnóÚŽb W™ÄQó’ (—*-ĹTIäb
'R“ąNPÔKZMţ†üÍQ‡
'y6ű˛Š˛ŁŽ¬=iýQ|˛^˙ ‡ńb ¬ĂÇr'ń‡e˘źäžŇ/âéç;1qýěĂj$&E!V?¶ßšÍ´cĆ$Âű׺Ůî’ﲦŔ?TáÄu[nG¦•¸î»éüĽ˙xVPĚ.|
'ÖĚ/łó®Üă9Ę]ż/ĹÍT¶Mµę¶mÍ
'q[—qëýY~Pc©=jÍ8˘‡,Ú+ń8ŐűŻEüńWü1ďëDZ†ć}ęńwŠbŢ,>ó’Űçµ™Š_…qÝăt±+‡ĽČg­řÍ!·eŠP âńđ:ŶOážű?őë®ÁšńýĎáËTbž}|Ö…ăË[®™
You can use a regular expression to find variable assignments by looking for the equals sign. You'll need to add a reference to the Microsoft VBScript Regular Expressions 5.5 and Microsoft Visual Basic for Applications Extensibility 5.3 libraries as I've used early binding.
Please be sure to back up your work and test this before using it. I could have gotten the regex wrong.
UPDATE:
I've refined the regular expressions so that it no longer catches datatypes of strongly typed constants (Const ImAConstant As String = "Oh Noes!" previously returned String). I've also added another regex to return those constants as well. The last version of the regex also mistakenly caught things like .Global = true. That was corrected. The code below should return all variable and constant names for a given code module. The regular expressions still aren't perfect, as you'll note that I was unable to stop false positives on double quotes. Also, my array handling could be done better.
Sub printVars()
Dim linesCount As Long
Dim code As String
Dim vbPrj As VBIDE.VBProject
Dim codeMod As VBIDE.CodeModule
Dim regex As VBScript_RegExp_55.RegExp
Dim m As VBScript_RegExp_55.match
Dim matches As VBScript_RegExp_55.MatchCollection
Dim i As Long
Dim j As Long
Dim isInDatatypes As Boolean
Dim isInVariables As Boolean
Dim datatypes() As String
Dim variables() As String
Set vbPrj = VBE.ActiveVBProject
Set codeMod = vbPrj.VBComponents("Module1").CodeModule
code = codeMod.Lines(1, codeMod.CountOfLines)
Set regex = New RegExp
With regex
.Global = True ' match all instances
.IgnoreCase = True
.MultiLine = True ' "code" var contains multiple lines
.Pattern = "(\sAs\s)([\w]*)(?=\s)" ' get list of datatypes we've used
' match any whole word after the word " As "
Set matches = .Execute(code)
End With
ReDim datatypes(matches.count - 1)
For i = 0 To matches.count - 1
datatypes(i) = matches(i).SubMatches(1) ' return second submatch so we don't get the word " As " in our array
Next i
With regex
.Pattern = "(\s)([^\.\s][\w]*)(?=\s\=)" ' list of variables
' begins with a space; next character is not a period (handles "with" assignments) or space; any alphanumeric character; repeat until... space
Set matches = .Execute(code)
End With
ReDim variables(matches.count - 1)
For i = 0 To matches.count - 1
isInDatatypes = False
isInVariables = False
' check to see if current match is a datatype
For j = LBound(datatypes) To UBound(datatypes)
If matches(i).SubMatches(1) = datatypes(j) Then
isInDatatypes = True
Exit For
End If
'Debug.Print matches(i).SubMatches(1)
Next j
' check to see if we already have this variable
For j = LBound(variables) To i
If matches(i).SubMatches(1) = variables(j) Then
isInVariables = True
Exit For
End If
Next j
' add to variables array
If Not isInDatatypes And Not isInVariables Then
variables(i) = matches(i).SubMatches(1)
End If
Next i
With regex
.Pattern = "(\sConst\s)(.*)(?=\sAs\s)" 'strongly typed constants
' match anything between the words " Const " and " As "
Set matches = .Execute(code)
End With
For i = 0 To matches.count - 1
'add one slot to end of array
j = UBound(variables) + 1
ReDim Preserve variables(j)
variables(j) = matches(i).SubMatches(1) ' again, return the second submatch
Next i
' print variables to immediate window
For i = LBound(variables) To UBound(variables)
If variables(i) <> "" And variables(i) <> Chr(34) Then ' for the life of me I just can't get the regex to not match doublequotes
Debug.Print variables(i)
End If
Next i
End Sub

Get only the line of text that contains the given word VB2010.net

I have a text file on my website and I download the whole string via WebClient.DownloadString.
The text file contains this :
cookies,dishes,candy,(new line)
back,forward,refresh,(new line)
mail,media,mute,
This is just an example, not the actual string, but it will do for help purposes.
What I want is to download the whole string, find the line that contains the word the user entered in a textbox, and get that line into a string. Then I want to use String.Split with "," as the delimiter and output each word in that line into a RichTextBox.
Now here is the code that I have used (some fields are removed for privacy reasons).
If TextBox1.TextLength > 0 Then
words = web.DownloadString("webadress here")
If words.Contains(TextBox1.Text) Then
'retrieval code here
Dim length As Integer = TextBox1.TextLength
Dim word As String
word = words.Substring(length + 1) ' the plus 1 is for the ","
Dim cred() As String
cred = word.Split(",")
RichTextBox1.Text = "Your word: " + cred(0) + vbCr + "Your other word: " + cred(1)
Else
MsgBox("Sorry, but we could not find the word you have entered", MsgBoxStyle.Critical)
End If
Else
MsgBox("Please fill in a word", MsgBoxStyle.Critical)
End If
Now it works with no errors, but only for line 1, not for line 2 or 3.
What am I doing wrong?
It's because the string words also contains the new line characters that you seem to be omitting in your code. You should first split words with the delimiter \n (or \r\n, depending on the platform), like this:
Dim lines() As String = words.Split("\n")
After that, you have an array of strings, each element representing a single line. Loop it through like this:
For Each line As String In lines
If line.Contains(TextBox1.Text) Then
'retrieval code here
End If
Next
Smi's answer is correct, but since you're using VB you need to split on vbNewLine. \n and \r are for use in C#. I get tripped up by that a lot.
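A minimal sketch of the split-then-search approach in VB follows. It assumes the downloaded text may use either Windows or Unix line endings. The module and function names (SplitLinesSketch, FindLine) are invented for the example, and the comparison is case-sensitive.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitLinesSketch
    ' Hypothetical helper: returns the first line that has word as one of its
    ' comma-separated fields, or Nothing when no line contains it.
    Function FindLine(text As String, word As String) As String
        ' vbCrLf is listed first so a Windows line ending is consumed as one separator.
        Dim separators As String() = {vbCrLf, vbLf, vbCr}
        Dim lines As String() = text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        For Each line As String In lines
            Dim fields As String() = line.Split(","c)
            ' Compare whole fields, so "media" does not match a longer word that contains it.
            If Array.IndexOf(fields, word) >= 0 Then Return line
        Next
        Return Nothing
    End Function

    Sub Main()
        Dim downloaded As String = "cookies,dishes,candy," & vbCrLf &
                                   "back,forward,refresh," & vbCrLf &
                                   "mail,media,mute,"
        ' Prints the third line, the one that contains "media".
        Console.WriteLine(FindLine(downloaded, "media"))
    End Sub
End Module
```

Comparing whole fields rather than calling Contains on the line is a design choice here: it keeps a search for "me" from matching "media" or "mute". Either check works with the same split.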
Another way to do this is to use regular expressions. A regular expression match can both find the word you want and return the line that contains it in a single step.
Barely tested sample below. I couldn't quite figure out if your code was doing what you said it should be doing so I improvised based on your description.
Imports System.Text.RegularExpressions
Public Class Form1
Private Sub ButtonFind_Click(sender As System.Object, e As System.EventArgs) Handles ButtonFind.Click
Dim downloadedString As String
downloadedString = "cookies,dishes,candy," _
& vbNewLine & "back,forward,refresh," _
& vbNewLine & "mail,media,mute,"
'Use the regular expression anchor characters (^$) to match a line that contains the given text.
Dim wordToFind As String = Regex.Escape(TextBox1.Text) & "," 'Escape regex metacharacters in the user's input; include the comma that comes after each word to avoid partial matches.
Dim pattern As String = "^.*" & wordToFind & ".*$"
Dim rx As Regex = New Regex(pattern, RegexOptions.Multiline Or RegexOptions.IgnoreCase)
Dim M As Match = rx.Match(downloadedString)
'M.Success is False when no matching word was found;
'otherwise M.Value is the matching line.
If M.Success Then
Dim words() As String = M.Value.Trim().Split(","c) 'Trim drops the carriage return that "." captures before the line feed.
RichTextBox1.Clear()
For Each word As String In words
If Not String.IsNullOrEmpty(word) Then
RichTextBox1.AppendText(word & vbNewLine)
End If
Next
Else
RichTextBox1.Text = "No match found."
End If
End Sub
End Class