String Manipulation Inconsistency - vb.net

This is a more general question. I am reading a document and saving its contents into a string variable. The resulting variable contains approximately 1 million characters (no cleansing). My code then searches the string and extracts key words. However, I am stuck on an issue:
If I pass the result of Mid directly to a message box, it shows me the contents:
Messagebox.Show(Mid(searchString, startPos, endPos))
However, if I first assign the result of Mid to a string variable, the variable is empty and the message box displays nothing:
Dim myString as String
myString = Mid(searchString, startPos, endPos)
Messagebox.Show(myString)
The same thing happens when I use .Substring and when I use a StringBuilder.
Does anybody know why this is happening? I assume something happens during assignment, but I am not sure what is lost.
Here is a snippet of code:
searchPos = textString.IndexOf(searchText, searchPos, StringComparison.OrdinalIgnoreCase)
MessageBox.Show(searchPos)
MessageBox.Show(Mid(textString, searchPos, 100))
So, the inconsistency is this: the length of textString is around 3,700,000 characters. IndexOf returns 455,225, which the first message box displays. However, when I try to pull out the characters using Mid, the second message box is blank.
Also, although textString claims to be 3,700,000 characters long, a message box on textString shows only around 6 characters of what appears to be XML. The file being searched is an old .ppt file, and I know I can just work around it, but I am confused by how the computer can find the IndexOf of my searchText correctly and then cannot show me anything.
Thoughts?
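One possible explanation, offered as an assumption rather than a confirmed diagnosis: an old .ppt file is binary, so the string probably contains embedded NUL characters. A message box stops displaying at the first NUL, while IndexOf keeps searching past it. Note also that IndexOf is 0-based and Mid is 1-based. The sketch below uses a made-up sample string (not the real file) to show both effects in a console program:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module NullCharDemo
    Sub Main()
        ' Hypothetical sample: text with embedded NUL characters, as binary files often contain.
        Dim raw As String = "<?xml" & ControlChars.NullChar & "hidden" & ControlChars.NullChar & "keyword here"

        ' IndexOf is 0-based and is not stopped by NUL characters.
        Dim pos As Integer = raw.IndexOf("KEYWORD", StringComparison.OrdinalIgnoreCase)

        ' Replace each NUL with a visible placeholder before displaying the text.
        Dim visible As String = raw.Replace(ControlChars.NullChar, "."c)

        ' Substring uses the same 0-based index as IndexOf.
        Console.WriteLine(visible.Substring(pos, 7))

        ' Mid is 1-based, so the IndexOf result needs + 1 to address the same character.
        Console.WriteLine(Mid(visible, pos + 1, 7))
    End Sub
End Module
```

Both lines print the same seven characters. If the real string behaves the same way, replacing the NUL characters before calling MessageBox.Show should make the text visible.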

Related

Case sensitive search from text file fix?

A few days ago, I asked a question on Stack Overflow about how to search a text file for strings matching a search text box. This has worked great so far, except for the fact that the search is case sensitive. I thought of a way of overcoming this, but it doesn't work the way I want it to.
My idea/solution:
If ListBox.Items.Count = 0 Then
tbx_FindText.CharacterCasing = CharacterCasing.Upper
ElseIf ListBox.Items.Count = 0 Then
tbx_FindText.CharacterCasing = CharacterCasing.Lower
End If
This would essentially try both fully upper and fully lower case. But what happens if the user types a search request such as 'Gsk'? The 'G' is capitalized but the other characters aren't, so the string is mixed case. If the search text is not exactly the same as the string in the text file (whether that is fully upper case, fully lower case, or mixed case), the program reports that there are no search results when there are. The search algorithm is simply case sensitive and doesn't match properly.
Search Algorithm Code:
Dim lines1() As String = IO.File.ReadAllLines("C:\ProgramData\WPSECHELPER\.data\Outlook Folder Wizard\outlookfolders.txt")
lbx_OFL_Results.Items.Clear()
lbx_OFL_Results.BeginUpdate()
For i As Integer = 0 To lines1.Length - 1
If lines1(i).Contains(tbx_FindText.Text) Then lbx_OFL_Results.Items.Add(lines1(i))
Next
lbx_OFL_Results.EndUpdate()
Essentially, the code opens the text file, which contains several Outlook Folder Paths needed by employees to do their jobs. They enter a search for a company name or reference number into a search box, and the list box populates with matching results of paths that contain the keywords that were entered in the search text box.
That part works great, apart from the fact that the list box doesn't populate with results if, for example, my search is capitalized and the string in the text file isn't.
If anyone could help compose (or reconstruct) a piece of code that searches the text file without being case sensitive (keeping the code above if possible), it would be greatly appreciated.
Don't use the ReadAllLines function, since you don't need to keep all the lines from the text file. It loads the whole file into memory, which is unnecessary, especially when you are dealing with big files. Use ReadLines instead, which reads the file lazily, together with the Where extension method to get the matches:
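' Assumes System.Linq is imported (it is by default in most VB.NET project templates).
Dim path As String = "C:\ProgramData\WPSECHELPER\.data\Outlook Folder Wizard\outlookfolders.txt"
Dim search As String = tbx_FindText.Text
' ReadLines enumerates the file lazily; Where keeps only the lines that contain the search text, ignoring case.
Dim lines = IO.File.ReadLines(path).Where(
Function(l) l.IndexOf(search, 0, StringComparison.InvariantCultureIgnoreCase) >= 0
).ToList()
' Reset the data source so the list box refreshes with the new matches.
lbx_OFL_Results.DataSource = Nothing
lbx_OFL_Results.DataSource = lines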
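If you would rather keep the original loop, a minimal sketch of the same idea is to swap Contains for IndexOf with a case-insensitive comparison. The sample lines and the searchText variable below are hypothetical stand-ins for the text file and tbx_FindText.Text, and the output goes to the console instead of the list box:

```vbnet
Imports System

Module CaseInsensitiveSearchDemo
    Sub Main()
        ' Hypothetical sample data standing in for the lines of outlookfolders.txt.
        Dim lines1() As String = {"\\Clients\GSK\Invoices", "\\Clients\Acme\Ref 1001", "\\Clients\gsk\Archive"}
        ' Stand-in for tbx_FindText.Text.
        Dim searchText As String = "Gsk"

        For i As Integer = 0 To lines1.Length - 1
            ' IndexOf returns -1 when there is no match; OrdinalIgnoreCase ignores letter case.
            If lines1(i).IndexOf(searchText, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Console.WriteLine(lines1(i))
            End If
        Next
    End Sub
End Module
```

With this sample data, the two GSK paths are printed and the Acme path is skipped, whatever the casing of the search text.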

VB.NET - Substring function that stops reading at first integer, possible?

I currently have a text file that has a little over 500 lines of paths.
(i.e., N:\Fork\Cli\Scripts\ABC01.VB)
Some of these file names vary in length (i.e., ABC01.VB, ABCDEF123.VB, etc)
How can I use the Substring function to remove the path, the numbers, and the file extension, leaving just the letters?
For example, processing N:\Fork\Cli\Scripts\ABC01.VB, and returning ABC.
Or N:\Fork\Cli\Scripts\ZUBDK22039.VB and returning ZUBDK.
I've only been able to retrieve the first 3 letters using this code:
Dim comp As String = sLine.Substring(28, 3)
sw.WriteLine(comp)
As Plutonix points out, the best way to isolate the file name from a path is with System.IO.Path.GetFileNameWithoutExtension.
You can extract just the letters (not digits or other characters) from a filename like this:
Dim myPath As String = "N:\Fork\Cli\Scripts\AB42Cde01.VB"
Dim filename As String = System.IO.Path.GetFileNameWithoutExtension(myPath)
Dim letters As String = filename.Where(Function(c) Char.IsLetter(c)).ToArray
The above code sets letters to ABCde.
The code relies on the fact that a String can be enumerated as a sequence of characters. The Where method visits every character in the string and selects only the ones that are letters (using the Char.IsLetter method). The selected characters are converted to a Char array, which VB.NET implicitly converts to a String when it is assigned to the letters variable.
I see from your latest comment that it is not possible for numerals to be mixed with the letters (as in my example). However, the code should still work in your case.
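If the goal is literally to stop reading at the first digit, as the title asks, one alternative sketch is to use TakeWhile instead of Where. This is a swapped-in variation, not the code from the answer above, and it assumes a Windows system, where the backslash is treated as a path separator:

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module LeadingLettersDemo
    Sub Main()
        ' Sample paths taken from the question.
        Dim paths() As String = {"N:\Fork\Cli\Scripts\ABC01.VB", "N:\Fork\Cli\Scripts\ZUBDK22039.VB"}

        For Each p As String In paths
            ' Strip the directory and the extension (assumes Windows path rules).
            Dim name As String = Path.GetFileNameWithoutExtension(p)

            ' TakeWhile stops at the first character that is a digit.
            Dim prefix As New String(name.TakeWhile(Function(c) Not Char.IsDigit(c)).ToArray())

            Console.WriteLine(prefix)
        Next
    End Sub
End Module
```

Unlike Where, TakeWhile ignores everything after the first digit, so any letters that follow the number are not collected.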

Read a text file and display result in a window

I have a text file which contains about 60 lines. I would like to parse out all the text from that file and display it in a window. The text file contains words that are separated by underscores. I would like to use a regular expression to solve this problem.
Update:
This is my code as of now. I am trying to read "filename" in my code.
Dim filename = "D:\databases.txt"
Dim regexpression As String = "/^[^_]*_([^_]*)\w/"
I know I don't have much done here anyway but I am trying to learn VB on my own and have gotten stuck here.
Please feel free to suggest what I should be doing instead.
Something like this:
TextBox1.Lines = IO.File.ReadAllLines(filename)
To remove underscores:
TextBox1.Lines = IO.File.ReadAllLines(filename).Select(Function(line) line.Replace("_", String.Empty)).ToArray()
If you also need other special characters removed, you can use Regex.Replace:
Remove special characters from a string
Also on MSDN:
How to: Strip Invalid Characters from a String
Or the old school way - loop through all characters, and filter only those you need:
Most efficient way to remove special characters from string
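As a rough illustration of the Regex.Replace approach mentioned above, here is a minimal sketch. The sample line and the rule about which characters to keep (letters, digits, and spaces) are assumptions, so adjust the pattern to your own data:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module StripSpecialCharsDemo
    Sub Main()
        ' Hypothetical sample line; real input would come from the text file.
        Dim line As String = "customer_db-01 (backup)!"

        ' Assumed rule: keep ASCII letters, digits, and spaces; remove everything else.
        Dim cleaned As String = Regex.Replace(line, "[^A-Za-z0-9 ]", String.Empty)

        Console.WriteLine(cleaned)
    End Sub
End Module
```

The character class is negated with ^, so every character that is not listed inside the brackets gets replaced with an empty string.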

How to remove white spaces except vbCr/vbCrLf/newline in vb.net

I am currently using this code to remove excess new lines from a split string:
Me.rtb.Lines = Me.rtb.Text.Split(New Char() {ControlChars.Lf}, _
StringSplitOptions.RemoveEmptyEntries)
I use a RichTextBox. I split the string with
incoming = stringOfRtb.Split(ControlChars.CrLf.ToCharArray) ''vbCrLf splitter
As a result, I get one string per line. But sometimes, I think the first code removes not only the white space but also the vbCrLf or the newlines that the module sends back. The awful thing is that this happens at random places, so the strings that I put into text boxes get shuffled and end up in different array positions every time I receive the same data.
Sometimes, it's like this:
While rtb.Text.Contains(vbLf & vbLf)
rtb.Text = rtb.Text.Replace(vbLf & vbLf, vbLf)
End While
This code should collapse doubled line-feed characters into one.
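One way to collapse repeated line breaks without looping, sketched here under the assumption that blank lines carry no meaning in the incoming data, is to split on both CR and LF with RemoveEmptyEntries and then rejoin with a single separator. The sample text is made up:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module CollapseNewlinesDemo
    Sub Main()
        ' Hypothetical sample with mixed and repeated line breaks.
        Dim sample As String = "line one" & vbCrLf & vbCrLf & "line two" & vbLf & vbLf & vbLf & "line three"

        ' Split on CR and LF individually; RemoveEmptyEntries drops the empty
        ' strings produced by consecutive separators.
        Dim lines() As String = sample.Split(New Char() {ControlChars.Cr, ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries)

        ' Rejoin with exactly one line feed between the surviving lines.
        Dim collapsed As String = String.Join(vbLf, lines)

        Console.WriteLine(collapsed)
    End Sub
End Module
```

Spaces inside each line are left untouched; only the empty entries between consecutive line breaks are removed.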

controlP5 textfield contents. Processing

I have a sketch in Processing that contains a textfield and a submit button. When the submit button is pressed, a file is created using the name given in the textfield. I want to make sure something has been entered into the textfield when the submit button is pressed. However, it appears that by default the string is neither empty nor white space, so it is not caught by my if statements.
Is there any simple way to check that something has been entered in the text field without needing to resort to something like regex?
I am not sure I understood whether by default your string is not empty and also does not contain white space (which would make it an odd example). The best possible check I can think of is to trim whatever the entered string is and then check if it is empty:
if(enteredString.trim().length() > 0) println("The string is valid");
The trim() method removes leading and trailing whitespace, so a string made up only of spaces becomes empty. Also, since you are saving files, you might want to check for invalid characters. With Processing (Java) you don't necessarily have to resort to regex, since you can do things like this:
String s = "ashd/ah";
println(s.contains("/"));
println(s.replace("/","-"));
which will print:
true
ashd-ah