VB.NET - Substring function that stops reading at first integer, possible? - vb.net

I currently have a text file that has a little over 500 lines of paths.
(i.e., N:\Fork\Cli\Scripts\ABC01.VB)
Some of these file names vary in length (i.e., ABC01.VB, ABCDEF123.VB, etc)
How can I use the Substring function to remove the path, the numbers, and the file extension, leaving just the letters?
For example, processing N:\Fork\Cli\Scripts\ABC01.VB, and returning ABC.
Or N:\Fork\Cli\Scripts\ZUBDK22039.VB and returning ZUBDK.
I've only been able to retrieve the first 3 letters using this code:
Dim comp As String = sLine.Substring(28, 3)
sw.WriteLine(comp)

As Plutonix points out, the best way to isolate the file name from a path is with System.IO.Path.GetFileNameWithoutExtension.
You can extract just the letters (not digits or other characters) from a filename like this:
Dim myPath As String = "N:\Fork\Cli\Scripts\AB42Cde01.VB"
Dim filename As String = System.IO.Path.GetFileNameWithoutExtension(myPath)
Dim letters As String = filename.Where(Function(c) Char.IsLetter(c)).ToArray
The above code sets letters to ABCde.
The code relies on the fact that a String can be enumerated as a sequence of characters. The Where method examines every character in the string and selects only the ones that are letters (using the Char.IsLetter method). ToArray collects the selected characters into a Char array, which VB.NET converts to a String when it is assigned to the letters variable.
I see from your latest comment that it is not possible for numerals to be mixed with the letters (as in my example). However, the code should still work in your case.
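Since the letters always come first in your file names, another option is to stop reading at the first character that is not a letter, which is closer to what the title asks for. The sketch below is one possible way to do that with LINQ's TakeWhile. The module name and the LeadingLetters helper are made up for illustration, and it assumes the code runs on Windows, where the backslash is treated as a directory separator.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module LeadingLettersDemo
    ' Hypothetical helper: returns the run of letters at the start of the file name.
    Function LeadingLetters(fullPath As String) As String
        ' Strip the directory and the extension first (assumes Windows-style paths).
        Dim name As String = Path.GetFileNameWithoutExtension(fullPath)
        ' TakeWhile stops at the first character that is not a letter.
        Return New String(name.TakeWhile(Function(c) Char.IsLetter(c)).ToArray())
    End Function

    Sub Main()
        Console.WriteLine(LeadingLetters("N:\Fork\Cli\Scripts\ABC01.VB"))
        Console.WriteLine(LeadingLetters("N:\Fork\Cli\Scripts\ZUBDK22039.VB"))
    End Sub
End Module
```

On Windows this should print ABC and then ZUBDK. Unlike the Where version, it would return only AB for a name such as AB42Cde01, because it gives up at the first digit.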

Related

Case sensitive search from text file fix?

A few days ago, I asked a question on Stack Overflow, asking how to search a text file for matching strings from a search text box. This has worked great so far, except from the fact that the search was case sensitive. I thought of a way of overcoming this, however it wouldn't work in the way I necessarily wanted it to.
My idea/solution:
If ListBox.Items.Count = 0 Then
tbx_FindText.CharacterCasing = CharacterCasing.Upper
ElseIf ListBox.Items.Count = 0 Then
tbx_FindText.CharacterCasing = CharacterCasing.Lower
End If
This would essentially try both fully upper and lower case. But what happens if the user types a search request such as 'Gsk'? The 'G' is capitalized but the other characters aren't, so the string is mixed case rather than fully upper or lower case. If the search text is not exactly the same as the string in the text file (whether that string is fully upper case, fully lower case, or mixed case), the program reports that there are no search results when there are. The search algorithm is simply case sensitive and doesn't match the text properly.
Search Algorithm Code:
Dim lines1() As String = IO.File.ReadAllLines("C:\ProgramData\WPSECHELPER\.data\Outlook Folder Wizard\outlookfolders.txt")
lbx_OFL_Results.Items.Clear()
lbx_OFL_Results.BeginUpdate()
For i As Integer = 0 To lines1.Length - 1
If lines1(i).Contains(tbx_FindText.Text) Then lbx_OFL_Results.Items.Add(lines1(i))
Next
lbx_OFL_Results.EndUpdate()
Essentially, the code opens the text file, which contains several Outlook Folder Paths needed by employees to do their jobs. They enter a search for a company name or reference number into a search box, and the list box populates with matching results of paths that contain the keywords that were entered in the search text box.
That part works great - apart from the fact the list box doesn't populate with results if my search is capitalized, and the string in the text file isn't, for example.
If anyone could help compose (or reconstruct) a piece of code that searches the text file without being case sensitive (keeping the code above if possible), it would be greatly appreciated.
Don't use the ReadAllLines function, since you don't need to hold all the lines from the text file at once. It loads the whole file into memory, which is unnecessary, especially when you are dealing with big files. Use ReadLines instead, with the Where extension method, to get the matches:
' Requires Imports System.IO and Imports System.Linq at the top of the file.
Dim path As String = "C:\ProgramData\WPSECHELPER\.data\Outlook Folder Wizard\outlookfolders.txt"
Dim search As String = tbx_FindText.Text
' ReadLines streams the file line by line; the IgnoreCase comparison makes the match case insensitive.
Dim lines = File.ReadLines(path).Where(
    Function(l) l.IndexOf(search, 0, StringComparison.InvariantCultureIgnoreCase) >= 0
).ToList()
' Rebind the list box to the matching lines.
lbx_OFL_Results.DataSource = Nothing
lbx_OFL_Results.DataSource = lines

Word Macro for separating a comma-separated list to columns

I have a very large set of data that represents cartesian coordinates in the form x0,y0,z0,x1,y1,z1...xn,yn,zn. I need to create a new line at the end of each xyz coordinate. I have been trying to record a macro that moves a certain number of spaces from the beginning of each line, then creates a new line. This, of course, will not work since the number of digits in each xyz coordinate differs.
How can I create a macro to do this in Microsoft Word?
Try this:
Public Sub test()
    Dim s As String
    Dim v As Variant
    Dim t As String
    Dim I As Long
    s = "x0,y0,z0,x1,y1,z1,xn,yn,zn"
    ' Split into a 0-based array of individual values
    v = Split(s, ",")
    t = ""
    For I = LBound(v) To UBound(v)
        t = t & v(I)
        ' Every third value (index 2, 5, 8, ...) ends a triplet
        If I Mod 3 = 2 Then
            t = t & vbCr
        Else
            t = t & ","
        End If
    Next I
    ' Drop the trailing separator
    t = Left(t, Len(t) - 1)
    Debug.Print t
End Sub
The Split function splits a string along the delimiter you specify (a comma in your case), returning the results in a 0-based array. Then in the For loop we stitch the pieces back together, adding a carriage return (vbCr) after every third element and a comma otherwise.
The final (optional) step is to remove the trailing carriage return.
Hope that helps
The question placed before us was most clearly asked
“Please produce a macro sufficient to the task
I have Cartesian coordinates, a single line of these
Array them in many lines, triplets if you please!”
Instinctively we start to code, a solution for this quest
Often without asking, “Is this way truly best?”
But then another scheme arises from the mind
That most venerated duo: Word Replace and Find
Provide the two textboxes each an incantation
Check the Wildcard option and prepare for Amazation!
Forgive me!
In Word open Find/Replace
Click the More button and check the Use wildcards box
For Find what enter ([!,]{1,},[!,]{1,},[!,]{1,}),
For Replace with enter \1^p
Use Find Next, Replace and Replace All as usual
How it works
With wildcards, [!,]{1,} finds one or more characters that are NOT commas. This idiom is repeated 3 times, with 2 commas separating the 3 instances, so it matches 3 comma-delimited coordinates. The whole expression is then wrapped in parentheses to create an auto-numbered group (in this case Group #1). Creating a group allows us to save the text that matches the pattern and use it in the Replace box. Outside of the parentheses is one more comma, which separates one triplet of coordinates from the next.
In the Replace box \1 retrieves auto-numbered group 1, which is our coordinate triplet. Following that is ^p which is a new paragraph in Word.
Hope that helps!
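For comparison, the same grouping idea can be sketched outside Word with a .NET regular expression. This is not part of the Find/Replace answer above, just an illustration under the assumption that the coordinates are already held in a string: [^,]+ stands in for Word's [!,]{1,}, $1 stands in for \1, and Environment.NewLine stands in for ^p. The module name is hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TripletSplitDemo
    Sub Main()
        Dim s As String = "x0,y0,z0,x1,y1,z1,xn,yn,zn"
        ' Capture three comma-separated values, then consume the comma that follows them.
        ' The captured triplet ($1) is written back, followed by a line break.
        Dim result As String = Regex.Replace(s, "([^,]+,[^,]+,[^,]+),", "$1" & Environment.NewLine)
        Console.WriteLine(result)
    End Sub
End Module
```

The last triplet has no trailing comma, so the pattern does not match it and it is left as the final line unchanged.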

How does this line of VB.NET code work?

I am using VB.NET and String.Format.
The line of code below populates s with 20 space characters. The problem is that I don't know how it works and can't find an explanation. Reference: MSDN String.Format Method.
Dim s As String = String.Format("{0, 20}", String.Empty)
It gives me the result I need, a string populated with 20 space characters, but what is the "0"? If I change that to any other number it creates an error.
And I don't see where / how it's specifying a Space char?
The format specifier {0, 20} places your object string.Empty, as argument {0}, at the end of a 20-character field. By that I mean that your item is right-aligned in a 20-character string and the remainder is padded with spaces. Since you're using string.Empty, you get a completely blank string. Try passing "z" and changing the number to a negative number.
string.Format("{0, -10}", "z");
This should give you a 10-character string starting with z and filled with spaces. This is the default behavior of the alignment component in string.Format, and it is commonly used to line up columns of values. The space doesn't need to be specified anywhere, because the alignment component always pads with spaces to reach the width you asked for.
If you use a non-empty string like:
string.Format("{0, 10}", "abc");
You should still get a 10 character string but it will look like
"       abc"
The 0 in the first parameter is the index of the argument.
The signature of that method is String.Format(string format, params object[] args).
So "{0, 20}" is the string to be formatted, and everything else is turned into an object array.
The zero is the index of the parameter that you are sending, in your case String.Empty.
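The answers above can be checked with a small console program. This is only a sketch (the module name is made up), showing the index and the alignment working together: a positive width right-aligns, a negative width left-aligns, and the index picks which argument is used. An index with no matching argument, such as {1} when only one argument is passed, throws a FormatException, which is the error mentioned in the question.

```vbnet
Imports System

Module AlignmentDemo
    Sub Main()
        ' Width 20 applied to an empty string yields 20 spaces.
        Dim blank As String = String.Format("{0,20}", String.Empty)
        Console.WriteLine(blank.Length)                                 ' 20
        Console.WriteLine(blank = New String(" "c, 20))                 ' True

        ' Positive width: right-aligned, padded on the left.
        Console.WriteLine("[" & String.Format("{0,10}", "abc") & "]")   ' [       abc]
        ' Negative width: left-aligned, padded on the right.
        Console.WriteLine("[" & String.Format("{0,-10}", "z") & "]")    ' [z         ]
        ' The index selects the argument: {1} is "b", {0} is "a".
        Console.WriteLine("[" & String.Format("{1,5}|{0,-5}", "a", "b") & "]")  ' [    b|a    ]
    End Sub
End Module
```

If 20 spaces are all that is needed, New String(" "c, 20) states the intent more directly than a format string.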

Read a text file and display result in a window

I have a text file which contains about 60 lines. I would like to parse out all the text from that file and display in a window. The text file contains words that are separated by an underscore. I would like to use regular expression to solve this problem.
Update:
This is my code as of now. I am trying to read "filename" in my code.
Dim filename = "D:\databases.txt"
Dim regexpression As String = "/^[^_]*_([^_]*)\w/"
I know I don't have much done here anyway but I am trying to learn VB on my own and have gotten stuck here.
Please feel free to suggest what I should be doing instead.
Something like this:
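TextBox1.Lines = IO.File.ReadAllLines(filename)
To remove underscores (ReadAllLines returns an array of strings, so Replace has to be applied to each line):
TextBox1.Lines = IO.File.ReadAllLines(filename).Select(Function(line) line.Replace("_", String.Empty)).ToArray()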
If you also need other special characters removed, you can use Regex.Replace:
Remove special characters from a string
Also on MSDN:
How to: Strip Invalid Characters from a String
Or the old school way - loop through all characters, and filter only those you need:
Most efficient way to remove special characters from string

String Manipulation Inconsistency

This is a more general question. I am reading a document and saving its contents into a string variable. The resulting variable contains approximately 1 million characters (no cleansing). My code would then search the string and extract key words. However, I am hung up on an issue:
If I pass the string directly to a message box, it will show me the contents using Mid:
Messagebox.Show(Mid(searchString, startPos, endPos))
However, if I first pass the mid to a string variable, the contents are empty and the messagebox displays nothing:
Dim myString as String
myString = Mid(searchString, startPos, endPos)
Messagebox.Show(myString)
The same effect happens when I use .substring and when I use a stringbuilder.
Does anybody know why this is happening? I assume something is happening during assignment, but I am not sure what is lost?
Here is a snippet of code:
searchPos = textString.IndexOf(searchText, searchPos, StringComparison.OrdinalIgnoreCase)
MessageBox.Show(searchPos)
MessageBox.Show(Mid(textString, searchPos, 100))
So, the inconsistency is as such: the length of textString is around 3,700,000 characters. When I find the indexOf, the value returned in the first Messagebox is 455,225. However, if I try to pull out the characters using Mid, the second messagebox is blank.
Also, although it claims to be 3,700,000 characters, if I do a messagebox on textString, I am only shown around 6 characters of what appears to be XML. The file that is being searched is an old .ppt file, and I know I can just work-around it, but I am confused by how the computer can find the indexof my searchText correctly, but then cannot show me anything.
Thoughts?