Removing CR/LF at end of file in VB.net - vb.net

I've searched for a solution to this, but any I've found are either doing much more than I need or are not exactly what I want.
I have files I want to append to. I need to append to the end of the last line but they all have a carriage return and so I'll end up appending to the new line if I just append as normal.
All I want is to make a subroutine that takes a file path and removes the CR/LF at the end of it, no more, no less. Any help pointing me at a solution to this would be appreciated. I'm surprised there isn't a built in function to do this.

' Requires: Imports System.IO and Imports System.Text
Dim crString = Environment.NewLine '= vbCrLf on Windows
Dim crBytes = Encoding.UTF8.GetBytes(crString)
Dim bytesRead(crBytes.Length - 1) As Byte
Dim iOffset As Integer = 0
Dim stringRead As String
Using fs = File.Open("G:\test.txt", FileMode.Open, FileAccess.ReadWrite)
    ' Walk backwards from the end of the file; stop before seeking past the start.
    While crBytes.Length + iOffset <= fs.Length
        fs.Seek(-(crBytes.Length + iOffset), SeekOrigin.End)
        fs.Read(bytesRead, 0, crBytes.Length)
        stringRead = Encoding.UTF8.GetString(bytesRead)
        If stringRead = crString Then
            ' Cut the file just before the line break, dropping it and anything after it.
            fs.SetLength(fs.Length - (crBytes.Length + iOffset))
            Exit While
        End If
        iOffset += 1
    End While
End Using
I open the text file as a FileStream and set its position to the end of the file minus the length of the line-break string.
I then read the bytes at that position, moving one byte further back each time, until I find a CR/LF or reach the start of the file.
If a CR/LF is found, I remove it and everything that comes after it.
If you don't want that, just remove the loop and check the end of the file only.
But there could be some trailing characters (such as nulls) after the last line break; that's why I'm using the loop.
Please note that I used UTF-8 encoding in my example. If your files use another encoding, you have to adapt it accordingly.
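The "no loop" variant mentioned above, wrapped in a subroutine that takes a file path as the question asked, might look roughly like the sketch below. The module and method names are made up for illustration, and it assumes an encoding in which CR and LF are single bytes (ASCII, ANSI or UTF-8). It only touches the file if the very last bytes are a line break.

```vbnet
Imports System.IO

Module TrailingNewline
    ' Hypothetical helper: removes one trailing line break (CR LF, or a lone LF or CR)
    ' from the end of the file, and leaves the file alone otherwise.
    ' Assumes CR and LF are stored as single bytes (ASCII, ANSI or UTF-8).
    Sub RemoveTrailingNewline(path As String)
        Using fs As New FileStream(path, FileMode.Open, FileAccess.ReadWrite)
            If fs.Length = 0 Then Return

            ' Look at the very last byte.
            fs.Seek(-1, SeekOrigin.End)
            Dim last As Integer = fs.ReadByte()
            If last <> 10 AndAlso last <> 13 Then Return ' no line break at the end

            Dim newLength As Long = fs.Length - 1

            ' If the file ends in LF, also drop a CR directly before it.
            If last = 10 AndAlso newLength > 0 Then
                fs.Seek(-2, SeekOrigin.End)
                If fs.ReadByte() = 13 Then newLength -= 1
            End If

            fs.SetLength(newLength)
        End Using
    End Sub
End Module
```

After calling RemoveTrailingNewline on a file, a normal append (for example File.AppendAllText) continues on the last existing line.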
EDIT: fs.SetLength part was wrong in case of last character in file was not a CR.

I have found that String.Replace(ControlChars.CrLf, "") works, although note that it removes every CR/LF in the string, not only the one at the end.
There are probably better ways to do it as well!

Related

search for filenames in textfiles

I have a PowerShell script that pulls out lines containing ".html", ".css" and so forth.
However, what I need is to strip out just the entire filename.
When I use a pattern, everything the pattern matches is returned. For example,
.........\.html returns
src="blank.html"
My answer came in VB (after a bunch of work and even more research), and I wanted to share the results with you all. It's not pretty, but it works. Is there an easier way?
I have commented the code to help in understanding.
Private Sub find()
    Dim reader As StreamReader = My.Computer.FileSystem.OpenTextFileReader(openWork.FileName)
    Dim a As String
    Dim SearchForThis As String
    Dim allfilenames As New System.Text.StringBuilder
    Dim first1 As String
    Dim FirstCharacter As Integer
    'Dim lines As Integer
    SearchForThis = txtFind.Text
    Do
        a = reader.ReadLine 'read the next line
        If a = "" Then
            a = reader.ReadLine
        End If
        If a Is Nothing Then 'without this check the For loops run with bad data, but I can't check "a" without reading it first.
        Else
            For FirstCharacter = 2 To a.Length - SearchForThis.Length ' start at 2 to prevent errors in the ")" check
                If Mid(a, FirstCharacter, SearchForThis.Length) = SearchForThis Then ' compare the line character by character to find the search string
                    If Mid(a, FirstCharacter - 1, 1) <> ")" Then ' checks for ")" just before the search string (a common problem with my .CSS finds)
                        For y = FirstCharacter To 1 Step -1
                            If Mid(a, y, 1) = Mid(a, FirstCharacter + SearchForThis.Length, 1) Then ' walks backwards until it finds the same character that follows the search string (the quote)
                                Dim temp = Mid(a, y + 1, (FirstCharacter + SearchForThis.Length) - 1 - y) ' puts the entire filename into variable "temp"
                                allfilenames.Append(temp & Chr(13)) 'adds the contents of temp (and a carriage return) to the allfilenames StringBuilder
                                y = 1
                            Else
                            End If
                        Next
                    End If
                End If
            Next
        End If
    Loop Until a Is Nothing
    Document.Text = allfilenames.ToString
    reader.Close()
End Sub
(updating for comments... thanks for the input)
each line in the .css search file looks something like this.
addPbrDlg.html:12:<link rel=stylesheet href="swl_styles-5.0o-97581885.css" TYPE="text/css">
addPbrDlg.html:727: html(getFrame(statusFrame).strErrorMessage).css('color','red');
for this I want to return
swl_styles-5.0o-97581885.css
but not return
statusFrame).strErrorMessage).css
basically I want to strip out the file names from HTML code
but if I use a pattern like
.............................\.css
it would return something like
t href="swl_styles-5.0o-97581885.css
Finally, there are some variables that I don't need to worry about (due to my personal situation). I know that all web pages are ".html" and all images are ".gif", and there are ".css" and ".js" files that I want to pull as well. Because the designers are extremely consistent, I know that there aren't any surprise files (.jpg or .htm).
I can also assume that if there is a single quote after the filename, there will be a single quote before it. The same goes for double quotes.
Thanks for your input so far... I appreciate your time and knowledge.
You need to use Regex and do something like this:
Dim files = Regex.Matches("<your whole file text>", "Your regex pattern")
Your regex pattern will look something like this: "src=""[^""]+\.(html|css)""" (quotes doubled for a VB string literal). This is probably not quite right for your data, but when you get that straight, follow with
Dim fileList As New List(Of String)
For Each file As Match In files
    ' strip the leading src=" and the trailing "
    fileList.Add(file.Value.Substring(5, file.Value.Length - 6))
Next
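Building on the Regex idea and the assumptions stated in the question (filenames sit between matching single or double quotes, and only .html, .css, .js and .gif matter), one possible pattern captures the filename in a group so that no Substring arithmetic is needed. This is only a sketch: the module name, function name and the exact pattern are illustrative choices, not something taken from the original posts.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FileNameFinder
    ' Group 1: the opening quote (single or double).
    ' Group 2: the filename, which may not contain quotes, parentheses or whitespace.
    ' \1 requires the closing quote to be the same character as the opening one.
    Private ReadOnly QuotedFileName As New Regex(
        "([""'])([^""'()\s]+\.(?:html|css|js|gif))\1",
        RegexOptions.IgnoreCase)

    ' Returns every quoted filename with one of the known extensions found in the text.
    Function ExtractFileNames(text As String) As List(Of String)
        Dim names As New List(Of String)
        For Each m As Match In QuotedFileName.Matches(text)
            names.Add(m.Groups(2).Value)
        Next
        Return names
    End Function
End Module
```

Because the match has to start and end with the same quote character, the unquoted .css( call in the second sample line is skipped, while href="swl_styles-5.0o-97581885.css" yields just the filename.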

How do I replace 'bad characters' in a text file with a space in VB.NET

I'm trying to rid a text file of corrupt data. I parse the file and if I find a bad character, I replace it with a space. My problem is that the space is not overwriting the bad character. Instead, the space is written on line 10 position 27. What's going on here?
I've been stuck on this seemingly simple problem for half a day. Thanks.
Sub replaceChars(fname As String)
    Dim fs As New FileStream(fname, FileMode.Open, FileAccess.ReadWrite)
    Dim r As New StreamReader(fs)
    Dim w As New StreamWriter(fs)
    Dim iChar As Integer = 0
    Do Until r.Peek() = -1
        iChar = r.Read()
        If iChar < 32 Or iChar > 126 Then
            If iChar = 13 Or iChar = 10 Then 'cr/lf, continue.
                Continue Do
            Else 'found a bad char. replace it.
                w.Write(Chr(32))
                w.Flush()
                fs.Flush()
            End If
        Else
            Continue Do
        End If
    Loop
    w.Close()
    fs.Close()
End Sub
I really don't like the idea of sharing one FileStream between a reader and a writer, since I have had lots of 'unreasonable' problems because of that.
I've learned that when working with IDisposable objects you should use the Using statement, which guarantees that the object is closed and disposed when the block ends, so you don't get weird errors.
Try this; maybe it helps, and it is always better to do it this way.
See the documentation for the Using statement.
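As for why the space lands in the wrong place: StreamReader reads ahead in blocks, so the position of the underlying FileStream is already well past the character that Read() just returned, and the StreamWriter writes at that stream position. One way around it is to skip the reader and writer entirely and work on raw bytes inside a Using block, keeping track of the position yourself. The sketch below assumes a single-byte encoding (ASCII or ANSI); the module and method names are made up for illustration.

```vbnet
Imports System.IO

Module BadCharCleaner
    ' Hypothetical helper: overwrites every byte outside printable ASCII (32 to 126),
    ' except CR (13) and LF (10), with a space, in place.
    ' Assumes one byte per character (ASCII or ANSI text).
    Sub ReplaceBadBytes(fname As String)
        Dim buffer(4095) As Byte
        Using fs As New FileStream(fname, FileMode.Open, FileAccess.ReadWrite)
            Dim count As Integer = fs.Read(buffer, 0, buffer.Length)
            While count > 0
                Dim changed As Boolean = False
                For i As Integer = 0 To count - 1
                    Dim b As Byte = buffer(i)
                    If (b < 32 OrElse b > 126) AndAlso b <> 13 AndAlso b <> 10 Then
                        buffer(i) = CByte(32)
                        changed = True
                    End If
                Next
                If changed Then
                    ' Step back over the block just read and overwrite it with the cleaned copy.
                    fs.Seek(-count, SeekOrigin.Current)
                    fs.Write(buffer, 0, count)
                End If
                count = fs.Read(buffer, 0, buffer.Length)
            End While
        End Using
    End Sub
End Module
```

The file length never changes, because each bad byte is replaced by exactly one space.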

Start reading massive text file from the end

I would like to ask for some alternatives for my problem.
Basically, I'm reading a .txt log file averaging 8 million lines, around 600 MB of raw text.
I'm currently using StreamReader to make 2 passes over those 8 million lines, sorting and filtering the important parts of the log file, but my computer takes ~50 seconds for 1 complete run.
One way I can optimize this is to make the first pass start reading at the end, because the most important data is located in approximately the final 200k lines. Unfortunately, from what I found, StreamReader can't do this. Any ideas on how to do this?
Some general restrictions:
The number of lines varies
The size of the file varies
The location of the important data varies, but it is approximately in the final 200k lines
Here's the loop code for the first pass of the log file just to give you an idea
Do Until sr.EndOfStream = True 'Read whole File
    Dim streambuff As String = sr.ReadLine
    Dim CombatLogNames() As String 'Array to Store CombatLogNames
    Dim searcher As String
    If streambuff.Contains("CombatLogNames flags:0x1") Then 'Keyword to Filter CombatLogNames Packets in the .txt
        Dim check As String = streambuff 'Duplicate of the Line being read
        Dim index1 As Char = check.Substring(check.IndexOf("(") + 1) '
        Dim index2 As Char = check.Substring(check.IndexOf("(") + 2) 'Used to bypass the first CombatLogNames packet that contain only 1 entry
        If (check.IndexOf("(") <> -1 And index1 <> "" And index2 <> " ") Then 'Stricter Filters for CombatLogNames
            Dim endCLN As Integer = 0 'Signifies the end of CombatLogNames Packet
            Dim x As Integer = 0 'Counter for array
            While (endCLN = 0 And streambuff <> "---- CNETMsg_Tick") 'Loops until the end keyword for CombatLogNames is seen
                streambuff = sr.ReadLine 'Reads a new line to flush out "CombatLogNames flags:0x1" which is unneeded
                If ((streambuff.Contains("---- CNETMsg_Tick") = True) Or (streambuff.Contains("ResponseKeys flags:0x0 ") = True)) Then
                    endCLN = 1 'Value change to determine end of CombatLogName packet
                Else
                    ReDim Preserve CombatLogNames(x) 'Resizes the array while preserving the values
                    searcher = streambuff.Trim.Remove(streambuff.IndexOf("(") - 5).Remove(0, _
                        streambuff.Trim.Remove(streambuff.IndexOf("(")).IndexOf("'")) 'Additional filtering to get only valuable data
                    CombatLogNames(x) = search(searcher)
                    x += 1 '+1 to Array counter
                End If
            End While
        Else
            'MsgBox("Something went wrong, Flame the coder of this program!!") 'Bug Testing code that is disabled
        End If
    Else
    End If
    If (sr.EndOfStream = True) Then
        ReDim GlobalArr(CombatLogNames.Length - 1) 'Resizing the Global array to prime it for copying data
        Array.Copy(CombatLogNames, GlobalArr, CombatLogNames.Length) 'Just copying the array to make it global
    End If
Loop
You CAN set the BaseStream to the desired reading position; you just can't set it to a specific LINE (because counting lines requires reading the complete file).
Using sw As New StreamWriter("foo.txt", False, System.Text.Encoding.ASCII)
    For i = 1 To 100
        sw.WriteLine("the quick brown fox jumps ovr the lazy dog")
    Next
End Using
Using sr As New StreamReader("foo.txt", System.Text.Encoding.ASCII)
    sr.BaseStream.Seek(-100, SeekOrigin.End)
    Dim garbage = sr.ReadLine ' can not use, because very likely not a COMPLETE line
    While Not sr.EndOfStream
        Dim line = sr.ReadLine
        Console.WriteLine(line)
    End While
End Using
For any later read attempt on the same file, you could simply save the final position (of the BaseStream) and, on the next read, advance to that position before you start reading lines.
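The same seek-then-discard idea can be wrapped in a small function that returns the complete lines found in the last part of a file. This is a sketch under assumptions: the names are made up, the size of the tail is given in bytes rather than lines (for the log in the question you would have to estimate how many bytes cover the final 200k lines), and the text is assumed to be ASCII as in the example above.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Text

Module TailReader
    ' Hypothetical helper: returns the complete lines found in the last tailBytes bytes of the file.
    Function ReadTailLines(path As String, tailBytes As Long) As List(Of String)
        Dim lines As New List(Of String)
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            ' Position the stream BEFORE the reader is created, so nothing is buffered yet.
            Dim start As Long = Math.Max(0L, fs.Length - tailBytes)
            fs.Seek(start, SeekOrigin.Begin)
            Using sr As New StreamReader(fs, Encoding.ASCII)
                ' Unless we start at the very beginning, the first line is probably cut off,
                ' so it is thrown away (this can also drop a complete line if the seek
                ' happens to land exactly on a line start).
                If start > 0 Then sr.ReadLine()
                Dim line As String = sr.ReadLine()
                While line IsNot Nothing
                    lines.Add(line)
                    line = sr.ReadLine()
                End While
            End Using
        End Using
        Return lines
    End Function
End Module
```

If the keyword being searched for is not found in the tail, the caller can retry with a larger tailBytes or fall back to a full pass.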
What worked for me was skipping the first 4M lines (just a simple If counter > 4M check surrounding everything inside the loop), and then adding background workers that did the filtering and, if a line was important, added it to an array, while the main thread continued reading lines. This saved about a third of the total run time.

Creating Fixed Width files from strings

I have searched high and low on the internet and I can't find a straight answer to this!
I have a file that has approximately 100,000 characters in one long line.
I need to read this file in and write it out again in its entirety, in lines 102 characters long, each ending with vbCrLf. There are no delimiters.
I thought there were a number of ways to tackle issues like this in VBScript... but apparently not!
Can anyone please provide me with a pointer?
Here's something (off the top of my head - untested!) that should get you started.
Const ForReading = 1
Const ForWriting = 2
Dim fso, tsIn, tsOut, sNewLine
Set fso = CreateObject("Scripting.FileSystemObject")
Set tsIn = fso.OpenTextFile("OldFile.txt", ForReading) ' Your input file
Set tsOut = fso.OpenTextFile("NewFile.txt", ForWriting, True) ' New (output) file; True creates it if it does not exist
While Not tsIn.AtEndOfStream ' While there is still text
    sNewLine = tsIn.Read(102) ' Read 102 characters
    tsOut.Write sNewLine & vbCrLf ' Write out to new file + CR/LF
Wend ' Loop to repeat
tsIn.Close
tsOut.Close
I won't cover the reading of files, since that is something you can find everywhere. And since it's been years since I've coded in VB or VBScript, I hope that .NET code will suffice.
Pseudocode: read the line from the file and put it in, for example, a string (this may have performance issues).
A simple algorithm would be the following; it might have performance issues (multithreading or parallel processing could be a solution):
Public Sub foo()
    Const width As Integer = 102
    Dim strLine As String = "foo²" ' placeholder for the long line read from the file
    Dim strLines As List(Of String) = New List(Of String)
    ' Step through the string in blocks of 102 characters; the last block may be shorter.
    For i As Integer = 0 To strLine.Length - 1 Step width
        strLines.Add(strLine.Substring(i, Math.Min(width, strLine.Length - i)))
    Next
    'save it to file
End Sub
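If holding the whole line in memory is a concern, the same chunking can be done while streaming, much like the VBScript version does with Read(102). The sketch below is one way to do that in VB.NET; the module and method names and the file paths in the usage note are placeholders, and the default encoding of StreamReader and StreamWriter (UTF-8) is assumed.

```vbnet
Imports System.IO
Imports Microsoft.VisualBasic

Module FixedWidth
    ' Hypothetical helper: copies inputPath to outputPath, breaking the text into
    ' lines of at most width characters, each terminated with CR/LF.
    Sub WrapFixedWidth(inputPath As String, outputPath As String, width As Integer)
        Dim buffer(width - 1) As Char
        Using reader As New StreamReader(inputPath)
            Using writer As New StreamWriter(outputPath)
                ' ReadBlock keeps reading until width characters are available or the file ends.
                Dim count As Integer = reader.ReadBlock(buffer, 0, width)
                While count > 0
                    writer.Write(buffer, 0, count)
                    writer.Write(ControlChars.CrLf)
                    count = reader.ReadBlock(buffer, 0, width)
                End While
            End Using
        End Using
    End Sub
End Module
```

A call such as WrapFixedWidth("OldFile.txt", "NewFile.txt", 102) would then produce the 102-character lines asked for; the final line is shorter if the input length is not a multiple of 102.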

IndexOf Method VB.net

How do I use the IndexOf method to find the index of a number? The number will be different on each line of the file. Each array element has a name and a different zip code. I want to search for the first number in the line. Everything before that index will be the first name and last name, and the rest will be the zip code.
infile = IO.File.OpenText("Names.txt")
'process the loop instruct until end of file
intSubscript = 0
Do Until infile.Peek = -1
    'read a line
    strLine(intSubscript) = infile.ReadLine
    intSubscript = intSubscript + 1
Loop
infile.Close()
A solution, from how I understand this:
Instead of using IndexOf, you can save each part of the file on a different line (ReadLine).
If you really need IndexOf: it's just String.IndexOf(EnterCharacterHere)
You could also read this file and only use the numbers found:
First you declare a constant, Const cstrNumbers As String = "0123456789", and then do the following:
For x As Integer = 0 To strInput.Length - 1
    strTemporary = strInput.Substring(x, 1)
    If InStr(cstrNumbers, strTemporary) <> 0 Then
        strOutput &= strTemporary
    End If
Next
strOutput will then contain the numbers.
Hope this helps,
Simon
EDIT:
This would be easier with a database, but I have no experience with databases in VB.NET.
You could do a Substring combined with the InStr I mentioned.
First you need a function that returns the first occurrence of a number (with InStr).
And then use this in the substring: String.Substring(FirstOccurrence, LengthOfZip).
I don't have the time to write the complete code now. Hope this helps you a bit.
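For finding the first digit in a line, String.IndexOfAny with an array of the ten digit characters does the search in one call and returns a zero-based index that can be passed straight to Substring. The sketch below splits a line into the name part and the zip code that way; the module and method names are made up, and it assumes the zip code is everything from the first digit to the end of the line.

```vbnet
Imports System

Module ZipSplitter
    Private ReadOnly Digits As Char() = "0123456789".ToCharArray()

    ' Hypothetical helper: splits a line such as "John Smith 12345" at the first digit.
    ' name receives everything before the first digit, zip everything from it onwards.
    Sub SplitNameAndZip(line As String, ByRef name As String, ByRef zip As String)
        Dim firstDigit As Integer = line.IndexOfAny(Digits)
        If firstDigit < 0 Then
            ' No digit on this line: treat the whole line as the name.
            name = line.Trim()
            zip = ""
        Else
            name = line.Substring(0, firstDigit).Trim()
            zip = line.Substring(firstDigit).Trim()
        End If
    End Sub
End Module
```

Called once per element of the strLine array from the question, this fills a name and a zip variable for each line.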