IndexOf Method VB.net - vb.net

How do I use the Indexof Method to search for an Index a number? The number will be different on each line of the file. Each array has a name and a different zip code. I want to tell it to search for the first number in the line. Everything before that index will be first name, last name, and then zip code.
infile = IO.File.OpenText("Names.txt")
'process the loop instruct until end of file
intSubscript = 0
Do Until infile.Peek = -1
'read a line
strLine(intSubscript) = infile.ReadLine
intSubscript = intSubscript + 1
Loop
infile.Close()

A solution from how I understand this:
Instead of using the IndexOf, you can save each part of the file on a different line (ReadLine).
If you really need the IndexOf: It's just String.IndexOf(EnterCharacterHere)
You could also read this file and only use the numbers found:
First you make a const string const cstrNumbers as string = "0123456789" and then do the following:
For x as integer = 0 to strInput -1
strTemporary = strInput.Substring(x,1)
If InStr(cstrNumbers, strTemporary) <> 0 Then
strOutput &= strTemporary
strOutput will contain the numbers then.
Hope this helps,
Simon
EDIT:
This would be easier with a database, but I have no experience with db in vb.net.
You could do a substring combined with the InStr I mentioned.
First you need a function that will return the first occurrence of a number. (With InStr)
And then use this in the substring (String.SubString(FirstOccurence, LengthOfZip)
Don't have the time to do the complete code now..Hope this helps you a bit

Related

Remove Double Quotes While Reading CSV Files in VBA

In VBA, I need to import a few R generated CSV files. However, the split function did not work properly and gaveType mismatch. My best guess is that: VBA added double quotes between each imported line. So the first line becomes " 47.27284, 130.5583, 44.826609, 189.905367". I tried to remove double quotes using replace or remove the first and last character but the error still existed. Any suggestions to deal with this issue?
CSV file
dose_BMD_r, dose_ED_r, dose_BMD_c, dose_ED_c
47.27284, 130.5583, 44.826609, 189.905367
47.27284, 130.5583, 52.226171, 233.338840
47.27284, 130.5583, 8.484266, 6.887616
VBA code
lin_ind = 1
Open text_fn For Input As #1
Do Until EOF(1)
Line Input #1, textline
If lin_ind = 1 Then
'Do nothing
Else
textline_1 = Split(textline, ",")
End If
lin_ind = lin_ind + 1
Loop
Close #1
Split function returns array. So the variable where you store Split's return value, should be array/variant.
Declare it as Dim textline_1 that's it. It will work.
OR Dim textline_1 () As String

Removing CR/LF at end of file in VB.net

I've searched for a solution to this, but any I've found are either doing much more than I need or are not exactly what I want.
I have files I want to append to. I need to append to the end of the last line but they all have a carriage return and so I'll end up appending to the new line if I just append as normal.
All I want is to make a subroutine that takes a file path and removes the CR/LF at the end of it, no more, no less. Any help pointing me at a solution to this would be appreciated. I'm surprised there isn't a built in function to do this.
Dim crString = Environment.NewLine '= vbCrLf
Dim crBytes = Encoding.UTF8.GetBytes(crString)
Dim bytesRead(crBytes.Length - 1) as Byte
Dim iOffset As Integer = 0
Dim stringRead As String
Using fs = File.Open("G:\test.txt", FileMode.Open, FileAccess.ReadWrite)
While iOffset < fs.Length
fs.Seek(- (crBytes.Length + iOffset), SeekOrigin.End)
fs.Read(bytesRead,0, crBytes.Length)
stringRead = Encoding.UTF8.GetString(bytesRead)
If stringRead = crString Then
fs.SetLength(fs.Length - (crBytes.Length * iOffset + 1))
Exit While
End If
iOffset += 1
End While
End Using
I open the text file as FileStream and set its position to the end of the file - length of the carriage return string.
I then read the current bytes while decreasing the offset until I found a carriage return or the eof has been reached.
If a CR has been found I remove it and everything what comes after.
If you don´t want that just remove the loop and check the eof only.
But there could be some vbNullString at the eof that´s why I´m using the loop.
Please note that I used UTF8 encoding in my example. If you have other encodings you have to adapt it accordingly.
test.txt before run:
test.txt after code snippet run:
EDIT: fs.SetLength part was wrong in case of last character in file was not a CR.
I have found String.Replace(ControlChars.CrLf.ToCharArray(),"") works.
Probably better ways to do it as well!

Replacing string at certain index of Split

Using streamreader to read line by line of a text file. When I get to a certain line (i.e., 123|abc|99999||ded||789), I want to replace ONLY the first empty area with text.
So far, I've been toying with
If sLine.Split("|")(3) = "" Then
'This is where I'm stuck, I want to replace that index with mmm
End If
I want the output to look like this: 123|abc|99999|mmm|ded||789
Considering you already have code determining if the "mmm" string needs to be added or not, you could use the following:
Dim index As Integer = sLine.IndexOf("||")
sLine = sLine.Insert(index + 1, "mmm")
You could split the string, modify the array and rejoin it to recreate the string:
Dim sLine = "123|abc|99999||ded||789"
Dim parts = sLine.Split("|")
If parts(3) = "" Then
parts(3) = "mmm"
sLine = String.Join("|", parts)
End If
I gather that if you find one or more empty elements, you want to replace the first empty element with data and leave the rest blank. You can accomplish this by splitting on the pipe to get an array of strings, iterate through the array and replace the first empty element you come across and exit the loop, and then rejoin your array.
Sub Main()
Dim data As String = "123||abc|99999||ded||789"
Dim parts = data.Split("|")
For index = 0 To parts.Length - 1
If String.IsNullOrEmpty(parts(index)) Then
parts(index) = "mmm"
Exit For
End If
Next
data = String.Join("|", parts)
Console.WriteLine(data)
End Sub
Results:
123|mmm|abc|99999||ded||789

Start reading massive text file from the end

I would ask if you could give me some alternatives in my problems.
basically I'm reading a .txt log file averaging to 8 million lines. Around 600megs of pure raw txt file.
I'm currently using streamreader to do 2 passes on those 8 million lines doing sorting and filtering important parts in the log file, but to do so, My computer is taking ~50sec to do 1 complete run.
One way that I can optimize this is to make the first pass to start reading at the end because the most important data is located approximately at the final 200k line(s) . Unfortunately, I searched and streamreader can't do this. Any ideas to do this?
Some general restriction
# of lines varies
size of file varies
location of important data varies but approx at the final 200k line
Here's the loop code for the first pass of the log file just to give you an idea
Do Until sr.EndOfStream = True 'Read whole File
Dim streambuff As String = sr.ReadLine 'Array to Store CombatLogNames
Dim CombatLogNames() As String
Dim searcher As String
If streambuff.Contains("CombatLogNames flags:0x1") Then 'Keyword to Filter CombatLogNames Packets in the .txt
Dim check As String = streambuff 'Duplicate of the Line being read
Dim index1 As Char = check.Substring(check.IndexOf("(") + 1) '
Dim index2 As Char = check.Substring(check.IndexOf("(") + 2) 'Used to bypass the first CombatLogNames packet that contain only 1 entry
If (check.IndexOf("(") <> -1 And index1 <> "" And index2 <> " ") Then 'Stricter Filters for CombatLogNames
Dim endCLN As Integer = 0 'Signifies the end of CombatLogNames Packet
Dim x As Integer = 0 'Counter for array
While (endCLN = 0 And streambuff <> "---- CNETMsg_Tick") 'Loops until the end keyword for CombatLogNames is seen
streambuff = sr.ReadLine 'Reads a new line to flush out "CombatLogNames flags:0x1" which is unneeded
If ((streambuff.Contains("---- CNETMsg_Tick") = True) Or (streambuff.Contains("ResponseKeys flags:0x0 ") = True)) Then
endCLN = 1 'Value change to determine end of CombatLogName packet
Else
ReDim Preserve CombatLogNames(x) 'Resizes the array while preserving the values
searcher = streambuff.Trim.Remove(streambuff.IndexOf("(") - 5).Remove(0, _
streambuff.Trim.Remove(streambuff.IndexOf("(")).IndexOf("'")) 'Additional filtering to get only valuable data
CombatLogNames(x) = search(searcher)
x += 1 '+1 to Array counter
End If
End While
Else
'MsgBox("Something went wrong, Flame the coder of this program!!") 'Bug Testing code that is disabled
End If
Else
End If
If (sr.EndOfStream = True) Then
ReDim GlobalArr(CombatLogNames.Length - 1) 'Resizing the Global array to prime it for copying data
Array.Copy(CombatLogNames, GlobalArr, CombatLogNames.Length) 'Just copying the array to make it global
End If
Loop
You CAN set the BaseStream to the desired reading position, you just cant set it to a specfic LINE (because counting lines requires to read the complete file)
Using sw As New StreamWriter("foo.txt", False, System.Text.Encoding.ASCII)
For i = 1 To 100
sw.WriteLine("the quick brown fox jumps ovr the lazy dog")
Next
End Using
Using sr As New StreamReader("foo.txt", System.Text.Encoding.ASCII)
sr.BaseStream.Seek(-100, SeekOrigin.End)
Dim garbage = sr.ReadLine ' can not use, because very likely not a COMPLETE line
While Not sr.EndOfStream
Dim line = sr.ReadLine
Console.WriteLine(line)
End While
End Using
For any later read attempt on the same file, you could simply save the final position (of the basestream) and on the next read to advance to that position before you start reading lines.
What worked for me was skipping first 4M lines (just a simple if counter > 4M surrounding everything inside the loop), and then adding background workers that did the filtering, and if important added the line to an array, while main thread continued reading the lines. This saved about third of the time at the end of a day.

How to find which delimiter was used during string split (VB.NET)

lets say I have a string that I want to split based on several characters, like ".", "!", and "?". How do I figure out which one of those characters split my string so I can add that same character back on to the end of the split segments in question?
Dim linePunctuation as Integer = 0
Dim myString As String = "some text. with punctuation! in it?"
For i = 1 To Len(myString)
If Mid$(entireFile, i, 1) = "." Then linePunctuation += 1
Next
For i = 1 To Len(myString)
If Mid$(entireFile, i, 1) = "!" Then linePunctuation += 1
Next
For i = 1 To Len(myString)
If Mid$(entireFile, i, 1) = "?" Then linePunctuation += 1
Next
Dim delimiters(3) As Char
delimiters(0) = "."
delimiters(1) = "!"
delimiters(2) = "?"
currentLineSplit = myString.Split(delimiters)
Dim sentenceArray(linePunctuation) As String
Dim count As Integer = 0
While linePunctuation > 0
sentenceArray(count) = currentLineSplit(count)'Here I want to add what ever delimiter was used to make the split back onto the string before it is stored in the array.'
count += 1
linePunctuation -= 1
End While
If you add a capturing group to your regex like this:
SplitArray = Regex.Split(myString, "([.?!])")
Then the returned array contains both the text between the punctuation, and separate elements for each punctuation character. The Split() function in .NET includes text matched by capturing groups in the returned array. If your regex has several capturing groups, all their matches are included in the array.
This splits your sample into:
some text
.
with punctuation
!
in it
?
You can then iterate over the array to get your "sentences" and your punctuation.
.Split() does not provide this information.
You will need to use a regular expression to accomplish what you are after, which I infer as the desire to split an English-ish paragraph into sentences by splitting on punctuation.
The simplest implementation would look like this.
var input = "some text. with punctuation! in it?";
string[] sentences = Regex.Split(input, #"\b(?<sentence>.*?[\.!?](?:\s|$))");
foreach (string sentence in sentences)
{
Console.WriteLine(sentence);
}
Results
some text.
with punctuation!
in it?
But you are going to find very quickly that language, as spoken/written by humans, does not follow simple rules most of the time.
Here it is in VB.NET for you:
Dim sentences As String() = Regex.Split(line, "\b(?<sentence>.*?[\.!?](?:\s|$))")
Once you've called Split with all 3 characters, you've tossed that information away. You could do what you're trying to do by splitting yourself or by splitting on one punctuation mark at a time.