How to replace CRLF with a space? - vb.net

How can I parse out undesirable characters from a collection of data?
I am working with existing VB.NET code for a Windows Application that uses StreamWriter and Serializer to output an XML document of transaction data. Code below.
Private TransactionFile As ProjectSchema.TransactionFile
Dim Serializer As New Xml.Serialization.XmlSerializer(GetType (ProjectSchema.TransactionFile))
Dim Writer As TextWriter
Dim FilePath As String
Writer = New StreamWriter(FilePath)
Serializer.Serialize(Writer, TransactionFile)
Writer.Close()
The XML document is being uploaded to another application that does not accept "crlf".
The "TransactionFile" is a collection of data in a Class named ProjectSchema.TransactionFile. It contains various data types.
There are 5 functions to create nodes that contribute to the creation of a Master Transaction file named TransactionFile
I need to find CRLF characters in the collection of data and replace the CRLF characters with a space.
I am able to replace illegal characters at the field level with:
.Name = Regex.Replace((Mid(CustomerName.Name, 1, 30)), "[^A-Za-z0-9\-/]", " ")
But I need to scrub the entire collection of data.
If I try:
TransactionFile = Regex.Replace(TransactionFile, "[^A-Za-z0-9\-/]", " ")
Because TransactionFile cannot be converted to String I get a "Conversion from type 'Transaction' to type 'String' is not valid" message.
Bottom Line = How can I replace CRLF with a space when it shows up in TransactionFile data?

Don't do it this way. Create the writer with XmlWriter.Create(), which has an overload that accepts an XmlWriterSettings object. That class has lots of options for formatting the generated XML, such as NewLineChars, which lets you set the characters to use for a line end.
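A minimal sketch of that suggestion. The TransactionFile class here is a hypothetical one-property stand-in for ProjectSchema.TransactionFile, and the output file name transactions.xml is likewise an assumption. With Indent off and NewLineHandling.Replace, the writer should emit no line breaks between elements and should substitute NewLineChars for line breaks inside text content; it is worth checking the output against your own data.

```vbnet
Imports System.Xml
Imports System.Xml.Serialization

' Hypothetical stand-in for ProjectSchema.TransactionFile.
Public Class TransactionFile
    Public Property Name As String
End Class

Module XmlWriterSettingsSketch
    Sub Main()
        Dim data As New TransactionFile With {.Name = "Line one" & vbCrLf & "Line two"}

        ' No indentation means no line breaks between elements;
        ' Replace swaps line breaks in text content for NewLineChars.
        Dim settings As New XmlWriterSettings With {
            .Indent = False,
            .NewLineChars = " ",
            .NewLineHandling = NewLineHandling.Replace
        }

        Dim serializer As New XmlSerializer(GetType(TransactionFile))
        ' Assumed output path, not taken from the question.
        Using writer As XmlWriter = XmlWriter.Create("transactions.xml", settings)
            serializer.Serialize(writer, data)
        End Using
    End Sub
End Module
```

The serializer itself is unchanged; only the writer passed to Serialize differs from the original code.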

As Hans says, mess around with the XmlWriterSettings.
The next best choice is to write the file, then read the file into an xml object and process it element by element. This would let you remove crlf from within individual elements, but leave the ones between elements alone, for example.
Another possibility - rather than write directly to the file, you can make an intermediate string, and do a replace in that:
Dim ms As New MemoryStream
Serializer.Serialize(ms, TransactionFile)
ms.Flush()
ms.Position = 0
Dim sr As New StreamReader(ms)
Dim xmlString As String = sr.ReadToEnd
sr.Close() ' also closes underlying memorystream
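Then you can run the replacement on xmlString before writing it to a file. This should catch every CRLF pair, both within elements and between them. Note that the field-level pattern from the question would also replace the angle brackets and quotes of the XML markup, so target only the line breaks.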
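Putting the intermediate-string idea together, as a rough sketch. TransactionFile is again a hypothetical stand-in class and transactions.xml an assumed output path. A plain String.Replace on vbCrLf is used so that nothing but the CRLF pairs is touched.

```vbnet
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical stand-in for ProjectSchema.TransactionFile.
Public Class TransactionFile
    Public Property Name As String
End Class

Module StringReplaceSketch
    Sub Main()
        Dim data As New TransactionFile With {.Name = "Line one" & vbCrLf & "Line two"}
        Dim serializer As New XmlSerializer(GetType(TransactionFile))
        Dim xmlString As String

        ' Serialize into memory, then read the bytes back as text.
        Using ms As New MemoryStream()
            serializer.Serialize(ms, data)
            ms.Position = 0
            Using sr As New StreamReader(ms)
                xmlString = sr.ReadToEnd()
            End Using
        End Using

        ' Replace only the CRLF pairs; the markup itself is left alone.
        xmlString = xmlString.Replace(vbCrLf, " ")

        ' Assumed output path, not taken from the question.
        File.WriteAllText("transactions.xml", xmlString)
    End Sub
End Module
```

If the data can also contain lone line feeds, a second Replace on vbLf would be needed; that case is not covered here.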

Related

Deleting all lines in text file until you get to a word vb.net

Very new to vb.net, apologies if this is basic. I am trying to open up a text file and delete all the lines starting at index 0 until I hit the line that has the word I am looking for. Right now, it just deletes the word I put in it.
' Read the file line by line
Using reader As New IO.StreamReader(fileName)
While Not reader.EndOfStream()
Dim input As String = reader.ReadLine()
'Delete all lines up to String
Dim i As Integer
i = 0
For i = 0 To input.Contains("{MyWord}")
builder.AppendLine(input)
Next
End While
End Using
This is a partial answer, since you didn't say what to do with the rest of the lines.
Did you mean lines rather than the word?
Dim ShouldRead As Boolean
Dim builder As New System.Text.StringBuilder
Using reader As New IO.StreamReader(fileName)
'Skip lines until one contains the String, then keep that line and everything after it
While Not reader.EndOfStream()
Dim input As String = reader.ReadLine()
If input.Contains("{MyWord}") Then ShouldRead = True
If ShouldRead Then
builder.AppendLine(input)
End If
End While
End Using
I would tend to do it like this:
Dim lines = File.ReadLines(filePath).
SkipWhile(Function(line) Not line.Contains(word)).
ToArray()
File.WriteAllLines(filePath, lines)
The File.ReadLines method reads the lines of the file one by one and exposes them for processing as they are read. That's in contrast to the File.ReadAllLines method, which reads all the lines of the file and returns them in an array, at which point you can do as desired with that array.
The SkipWhile method will skip the items in a list while the specified condition is True and expose the rest of the list, so that code will skip lines while they don't contain the specified word and return the rest, which are then pushed into an array and returned. That array is then written back over the original file.
Just note that String.Contains is case-sensitive. If you're using .NET Core 2.1 or later then there is a case-insensitive overload but older versions would require the use of String.IndexOf for case-insensitivity.
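For the case-insensitive variant on older frameworks, a sketch using String.IndexOf with StringComparison.OrdinalIgnoreCase. The file name input.txt and its sample contents are made up so that the example runs on its own.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module SkipUntilWordSketch
    Sub Main()
        ' Hypothetical file and contents, created here for illustration only.
        Dim filePath As String = "input.txt"
        File.WriteAllLines(filePath, New String() {"skip me", "keep {MYWORD} here", "tail"})

        Dim word As String = "{MyWord}"

        ' IndexOf returns -1 when the word is absent, so keep skipping while it is < 0.
        ' ToArray forces the whole file to be read before it is overwritten.
        Dim lines = File.ReadLines(filePath).
            SkipWhile(Function(line) line.IndexOf(word, StringComparison.OrdinalIgnoreCase) < 0).
            ToArray()

        File.WriteAllLines(filePath, lines)
    End Sub
End Module
```

On runtimes that have the newer overload, line.Contains(word, StringComparison.OrdinalIgnoreCase) could replace the IndexOf test.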

VB.NET: Modifying non-text file as text without ruining it

I need my application to find and modify a text string in a .swp file (generated by VBA for SOLIDWORKS). If I open said file as text in Notepad++, most of the text looks like this (this is an excerpt):
Meaning there is readable text, and symbols that appear as NUL, BEL, ETX and so on, depending on selected encoding. If I make my changes via Notepad++ (finding and changing "1.38" to "1.39"), there are no issues, the file can be opened via SOLIDWORKS and is still recognized as valid. After all, I don't need to modify these non-readable bits. However, if I do the same modification in my VB.NET application,
Dim filePath As String = "D:\OneDrive\Desktop\launcher macro.swp"
Dim fileContents As String = My.Computer.FileSystem.ReadAllText(filePath, Encoding.UTF8).Replace("1.38", "1.39")
My.Computer.FileSystem.WriteAllText(filePath, fileContents, Encoding.UTF8)
then the file gets corrupted, and is no longer recognized by SOLIDWORKS. I suspect this is because ReadAllText and WriteAllText cannot handle whatever data is in these non-readable bits.
I tried many different encodings, but it seems to make no difference. I am not sure how Notepad++ does it, but I can't seem to get the same result in my VB.NET application.
Can someone advise?
Thanks to @jmcilhinney, this is a solution that worked for me: reading the file as bytes, converting to a string, and then saving, using ANSI encoding:
Dim file_name As String = "D:\OneDrive\Desktop\launcher macro.swp"
Dim bytes() As Byte
' Read the raw bytes; Using closes the stream even if reading fails
Using fs As New FileStream(file_name, FileMode.Open)
Using binary_reader As New BinaryReader(fs)
' ReadBytes takes an Integer, so convert the Long stream length explicitly
bytes = binary_reader.ReadBytes(CInt(fs.Length))
End Using
End Using
' Decode with Encoding.Default (the ANSI code page on .NET Framework), edit, and write back with the same encoding
Dim fileContents As String = System.Text.Encoding.Default.GetString(bytes)
fileContents = fileContents.Replace("1.38", "1.39")
System.IO.File.WriteAllText(file_name, fileContents, System.Text.Encoding.Default)
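One caveat: Encoding.Default is the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later, so the same code can behave differently across runtimes. A sketch that avoids that dependency by using ISO-8859-1, which maps every byte value to exactly one character and back. The file name sample.bin and its bytes are invented for the example; this is not a statement about the actual .swp format.

```vbnet
Imports System.IO
Imports System.Text

Module BinarySafeReplaceSketch
    Sub Main()
        ' Hypothetical sample file: binary bytes surrounding the ASCII text "1.38".
        Dim filePath As String = "sample.bin"
        File.WriteAllBytes(filePath, New Byte() {0, 7, 255, 49, 46, 51, 56, 3, 128})

        ' ISO-8859-1 maps bytes 0-255 one-to-one onto characters, so decoding loses nothing.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")

        Dim bytes As Byte() = File.ReadAllBytes(filePath)
        Dim contents As String = latin1.GetString(bytes)
        contents = contents.Replace("1.38", "1.39")

        ' Encode with the same encoding so untouched bytes are written back unchanged.
        File.WriteAllBytes(filePath, latin1.GetBytes(contents))
    End Sub
End Module
```

Keeping the replacement the same length as the original text is the cautious choice, since binary formats often store offsets or lengths.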

dispose of IO.File.ReadAllLines

I am reading very large text files (6-10 MB) and splitting them into multiple new text files. The source file has a common "header" and "footer" that I store in variables to be used later. I can't figure out how to properly dispose of IO.File.ReadAllLines. I'm concerned this will be held in memory if I don't dispose of it properly.
Text.Dispose or Text.Close isn't valid.
Dim testHeader As String
Dim testSite As String
Dim testStart As String
Dim testStop As String
Dim testTime As String
Dim text() As String = IO.File.ReadAllLines("C:\Users\anobis\Desktop\temp.txt")
testHeader = text(0)
testSite = text(text.Length - 4)
testStart = text(text.Length - 3)
testStop = text(text.Length - 2)
testTime = text(text.Length - 1)
text.dispose()
Later in the program I will be initiating another StreamReader and want to avoid conflicts and memory resource issues. I am new at coding so be gentle! Thanks!
' Open temp.txt with "Using" statement.
Using r As StreamReader = New StreamReader("C:\Users\anobis\Desktop\temp.txt")
' Store contents in this String.
Dim line As String
line = r.ReadLine
' Loop over each line in file, While list is Not Nothing.
Do While (Not line Is Nothing)
If line Like (sourceSN.Text + "*") Then 'Substitute in source serial number "xxxxxx*"
file.WriteLine(line)
End If
' Read in the next line of text file.
line = r.ReadLine
Loop
End Using
file.WriteLine(testSite)
file.WriteLine(testStart)
file.WriteLine(testStop)
file.WriteLine(testTime)
' Close transfer.txt file
file.Close()
You don't need to dispose of it. It returns a managed string array, whose lifetime is managed by the garbage collector. Internally, File.ReadAllLines disposes of the underlying native file handle it created to read all of the lines for you.
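A small sketch of the difference, assuming a made-up temp.txt created by the example itself. The array from File.ReadAllLines needs no cleanup, while a StreamReader holds the file open and so belongs in a Using block.

```vbnet
Imports System
Imports System.IO

Module ReadAllLinesSketch
    Sub Main()
        ' Hypothetical file and contents, created here for illustration only.
        Dim filePath As String = "temp.txt"
        File.WriteAllLines(filePath, New String() {"header", "data 1", "data 2", "site", "start", "stop", "time"})

        ' No Using needed: the file is already closed when ReadAllLines returns.
        Dim text() As String = File.ReadAllLines(filePath)
        Dim testHeader As String = text(0)
        Dim testTime As String = text(text.Length - 1)

        ' Optional: drop the reference so the garbage collector can reclaim the array sooner.
        text = Nothing

        ' A StreamReader keeps the file open, so Using guarantees it is closed.
        Using reader As New StreamReader(filePath)
            Console.WriteLine(reader.ReadLine())
        End Using

        Console.WriteLine(testHeader & " / " & testTime)
    End Sub
End Module
```

Because the file is closed before the second read begins, opening a StreamReader on the same path afterwards does not conflict.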

Textfieldparser Delimiters

I'm currently busy coding a hangman game in VB.NET.
As a wordlist, I have a textfile containing 1520 words, each one separated by a new line...
The best I could think of to get a random word is with a Randomize() function.
Then getting the word from the line # which was randomly generated.
Only to find out just now, that this method:
Using parser As New Microsoft.VisualBasic.FileIO.TextFieldParser("filepath")
parser.TextFieldType = FileIO.FieldType.Delimited
doesn't allow me to use a new line as a delimiter...
Considering all words have different lengths/widths, I can't use this either:
parser.TextFieldType = FileIO.FieldType.FixedWidth
Is there any better way for me to extract the word from that random line?
If not, what would be the delimiter I should use for this and how do I quickly change the breaklines into that new delimiter without resorting to Office?
Also, how can I use the textfieldparser to get the file from resources?
When I tried using
my.resources.filename
instead of "filepath", it gave me an ArgumentException due to "invalid characters in the path".
The easier way is to load your text file into a string collection, then pick a random index into the collection
Dim list As New List(Of String)
Using Reader As New StreamReader("C:\WordList.txt")
Dim line As String = Reader.ReadLine()
' ReadLine returns Nothing at the end of the file, so test before adding
Do While line IsNot Nothing
list.Add(line)
line = Reader.ReadLine()
Loop
End Using
Read all the words into a string array with File.ReadAllLines. One line of code:
Dim words() As String = File.ReadAllLines(path)
To select a random word, use Rnd
Randomize()
Dim randomWord As String = words(CInt(Math.Floor(Rnd * words.Length)))
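As an alternative to Randomize and Rnd, a sketch using the System.Random class. The three-word array is a placeholder for the array returned by File.ReadAllLines.

```vbnet
Imports System

Module RandomWordSketch
    Sub Main()
        ' Placeholder word list; in the game this would come from File.ReadAllLines.
        Dim words() As String = {"apple", "banana", "cherry"}

        ' Next(maxValue) returns an integer from 0 up to, but not including, maxValue,
        ' so the result is always a valid index into the array.
        Dim rng As New Random()
        Dim randomWord As String = words(rng.Next(words.Length))

        Console.WriteLine(randomWord)
    End Sub
End Module
```

Creating the Random instance once and reusing it for every pick is the usual pattern.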

trying to read a delimited text file from resources - but it wont run

I'm having a problem where instead of reading a text file from the location string, I changed it to read the text file from the resource location and it breaks my program. I've also used the insert snippet method to get most of this code, so it is safe to say I don't know what is going on. Could someone please help?
'reads the text out of a delimited text file and puts the words and hints into two separate arrays
' this works and made the program run
' Dim filename As String = Application.StartupPath + "\ProggramingList.txt"
'this doesn't work and brings back an "Illegal characters in path" error.
Dim filename As String = My.Resources.ProggramingList
Dim fields As String()
'my text files are delimited
Dim delimiter As String = ","
Using parser As New TextFieldParser(filename)
parser.SetDelimiters(delimiter)
While Not parser.EndOfData
' Read in the fields for the current line
fields = parser.ReadFields()
' Add code here to use data in fields variable.
'put the result into two arrays (the fields are the arrays I'm talking about). one holds the words, and one holds the corresponding hint
Programingwords(counter) = Strings.UCase(fields(0))
counter += 1
'this is where the hint is at
Programingwords(counter) = (fields(1))
counter += 1
End While
End Using
the error
ex.ToString()
"System.ArgumentException: Illegal characters in path.
at System.IO.Path.CheckInvalidPathChars(String path)
at System.IO.Path.NormalizePathFast(String path, Boolean fullCheck)
at System.IO.Path.NormalizePath(String path, Boolean fullCheck)
at System.IO.Path.GetFullPathInternal(String path)
at System.IO.Path.GetFullPath(String path)
at Microsoft.VisualBasic.FileIO.FileSystem.NormalizePath(String Path)
at Microsoft.VisualBasic.FileIO.TextFieldParser.ValidatePath(String path)
at Microsoft.VisualBasic.FileIO.TextFieldParser.InitializeFromPath(String path, Encoding defaultEncoding, Boolean detectEncoding)
at Microsoft.VisualBasic.FileIO.TextFieldParser..ctor(String path)
at HangMan.Form1.GetWords() in I:\vb\HangMan\HangMan\Form1.vb:line 274" String
The TextFieldParser constructor you use expects the name of a file. Instead, it gets the contents of the file. That goes kaboom, because the file content is not a valid path to a file. You'll need to use the constructor that takes a TextReader and use the StringReader class to provide it. For example:
Dim fields As String()
Dim delimiter As String = ","
Dim fileContent As String = My.Resources.ProggramingList
Dim stringStream As New System.IO.StringReader(fileContent)
Using parser As New TextFieldParser(stringStream)
REM etc...
End Using
This is a bit wasteful of memory but not an issue if the text is less than a megabyte or so. If it is more then you shouldn't put it in a resource.
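A fuller, runnable version of the same idea, as a sketch. The fileContent string is an invented stand-in for My.Resources.ProggramingList, using the word,hint layout described in the question.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module ResourceParserSketch
    Sub Main()
        ' Invented stand-in for the text returned by My.Resources.ProggramingList.
        Dim fileContent As String = "array,An ordered collection" & vbCrLf & "loop,Repeats a block"

        ' StringReader is a TextReader over the string, so no file path is involved.
        Using reader As New StringReader(fileContent)
            Using parser As New TextFieldParser(reader)
                parser.TextFieldType = FieldType.Delimited
                parser.SetDelimiters(",")

                While Not parser.EndOfData
                    ' fields(0) is the word, fields(1) is the hint.
                    Dim fields As String() = parser.ReadFields()
                    Console.WriteLine(fields(0).ToUpper() & ": " & fields(1))
                End While
            End Using
        End Using
    End Sub
End Module
```

In the real program the Console.WriteLine call would be replaced by the code that fills the word and hint arrays.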
When you debug this code, what is the value of the variable filename after you read it from My.Resources.GamesList? Is it a valid string, and does it point to your file?