Replacing 2x vbCrLf at once - vb.net

I have a string in which I'm trying to replace all VbCr / VbLf with VbCrLf. This is in an attempt to scrape some HTML.
My code looks like this:
leHTML = leHTML.Replace(vbLf, vbCrLf)
leHTML = leHTML.Replace(vbCr, vbCrLf)
However in many cases I'm then left with 2x vbCrLf of which I only want 1.
leHTML = leHTML.Replace(vbCrLf & vbCrLf, vbCrLf)
The line above doesn't seem to be doing anything. How can I replace 2x vbCrLf with 1x vbCrLf? Is there a better way of going about "normalizing" Line Feeds and Carriage Returns?

You should not replace a correct vbCrLf in the first place. Instead replace only those characters where replacement is necessary. A handy tool for this task is a regular expression.
There are two cases that you want to get rid of:
vbCr with no following vbLf
the Regex for this is (vbCr)(?!vbLf)
vbLf with no preceding vbCr
the Regex for this is (?<!vbCr)(vbLf)
Putting this together, we get the following regex:
Dim regex = New Regex("((" & vbCr & ")(?!" & vbLf & ")|(?<!" & vbCr & ")(" & vbLf & "))")
Throw this on your input and you're done:
leHTML = regex.Replace(leHTML, vbCrLf)
Here is a simple test program (vbCr and vbLf have been replaced by cr and lf respectively, so there is a visible output):
Dim str = "crlf cr cr lf crlf lf"
Dim regex = New Regex("((cr)(?!lf)|(?<!cr)(lf))")
str = regex.Replace(str, "crlf")
Console.WriteLine(str)
The result is:
crlf crlf crlf crlf crlf crlf
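The same idea can be wrapped in a small helper. This is only a sketch: the module and function names are made up for illustration, and the pattern uses the regex escapes \r and \n instead of concatenating vbCr and vbLf into the pattern string.

```vbnet
Imports System.Text.RegularExpressions

Module LineEndingDemo
    ' Matches a CR that is not followed by LF, or an LF that is not preceded by CR.
    Private ReadOnly LoneBreak As New Regex("\r(?!\n)|(?<!\r)\n")

    ' Hypothetical helper: converts lone CR or LF to CRLF and leaves existing CRLF pairs alone.
    Function NormalizeToCrLf(input As String) As String
        Return LoneBreak.Replace(input, vbCrLf)
    End Function

    Sub Main()
        Dim sample = "a" & vbCr & "b" & vbLf & "c" & vbCrLf & "d"
        Dim result = NormalizeToCrLf(sample)
        ' Make the control characters visible for inspection.
        Console.WriteLine(result.Replace(vbCr, "<CR>").Replace(vbLf, "<LF>"))
        ' Prints: a<CR><LF>b<CR><LF>c<CR><LF>d
    End Sub
End Module
```

Because an existing vbCrLf is never touched, no second pass is needed to remove doubled line breaks.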

You're going to have to work a little harder at this. Instead of blindly replacing characters, you need to see what is there first, then determine what you are replacing. For example (this is NOT the complete code):
If leHTML.Contains(vbCr) AndAlso leHTML.Contains(vbLf) Then
    leHTML = leHTML.Replace(vbCr & vbLf, vbCrLf)
ElseIf leHTML.Contains(vbCr) Then
    leHTML = leHTML.Replace(vbCr, vbCrLf)
ElseIf leHTML.Contains(vbLf) Then
    leHTML = leHTML.Replace(vbLf, vbCrLf)
Else
    ...
End If
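If you would rather stay with plain String.Replace, one possible way to avoid the doubling is to collapse everything to vbLf first and only then expand to vbCrLf. This is a sketch of a different technique than the branching above, and the helper name is hypothetical.

```vbnet
Module ReplaceChainDemo
    ' Hypothetical helper: normalizes CR, LF and CRLF to CRLF without regular expressions.
    Function NormalizeWithReplace(input As String) As String
        ' 1) Collapse CRLF to LF so that pairs are not split apart.
        ' 2) Turn any remaining lone CR into LF.
        ' 3) Expand every LF to CRLF.
        Return input.Replace(vbCrLf, vbLf).Replace(vbCr, vbLf).Replace(vbLf, vbCrLf)
    End Function

    Sub Main()
        Dim sample = "a" & vbCr & "b" & vbLf & "c" & vbCrLf & vbCrLf & "d"
        Dim result = NormalizeWithReplace(sample)
        ' Make the control characters visible for inspection.
        Console.WriteLine(result.Replace(vbCr, "<CR>").Replace(vbLf, "<LF>"))
        ' Prints: a<CR><LF>b<CR><LF>c<CR><LF><CR><LF>d
    End Sub
End Module
```

The order of the three calls matters: replacing vbCr before vbCrLf would split each existing pair and produce extra line breaks.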

This is probably a good case for a Regex replace expression.
For example:
Dim pattern = "(\r|\n)"
Dim search = "The" & vbCr & "Test string" & vbCr & _
"used as an" & vbLf & "Example" & vbCrLf & "."
Dim m = Regex.Replace(search, pattern, vbCrLf)
Console.WriteLine(m)
The first line defines the search pattern using regex escape syntax, where \r matches vbCr and \n matches vbLf; the alternation inside the group matches either a vbCr or a vbLf.
The Replace method then finds each such character and replaces it with the two-character vbCrLf sequence.
But now we have a problem: the vbCrLf already present in the test string has been doubled, so you need a second replace, applied to the result of the first, that collapses each doubled sequence into a single vbCrLf. Note that this also collapses any genuine blank lines (two consecutive vbCrLf) in the input.
pattern = "\r\n\r\n"
m = Regex.Replace(m, pattern, vbCrLf)
Console.WriteLine(m)
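A possible variation that avoids the second replace is to list the two-character sequence first in the alternation, so that an existing vbCrLf is consumed as a unit. This is a sketch only; the module and function names are invented for the example.

```vbnet
Imports System.Text.RegularExpressions

Module SinglePassDemo
    ' Hypothetical helper: "\r\n" is tried first, so a CRLF pair is matched whole
    ' and replaced by itself; lone CR or LF fall through to the other alternatives.
    Function NormalizeSinglePass(input As String) As String
        Return Regex.Replace(input, "\r\n|\r|\n", vbCrLf)
    End Function

    Sub Main()
        Dim search = "The" & vbCr & "Test string" & vbCr &
                     "used as an" & vbLf & "Example" & vbCrLf & "."
        Dim m = NormalizeSinglePass(search)
        ' Make the control characters visible for inspection.
        Console.WriteLine(m.Replace(vbCr, "<CR>").Replace(vbLf, "<LF>"))
    End Sub
End Module
```

Unlike the two-step version, this keeps intentional blank lines, because two consecutive vbCrLf pairs are matched and rewritten one pair at a time.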

Related

VB Remove blocks of text in textfile

I recently switched to VB after my time in C#, and I have a simple question about using System.IO. My predecessor wrote a package that generates error logs to a text file. The following is a sample:
2017-10-20 15:30:11.481
CmsMonitorService.exe, CmsMonitorService.UpdateCmsOffLine
OffLineUpdater error: Getting list of files stored on the off-line vault.
------------------------------
2017-10-20 15:31:11.547
CmsMonitorService.exe, CmsMonitorService.UpdateCmsOffLine
OffLineUpdater error: Creating folder 'OffLineUpdates' (it may already exist checkHost)
at CmsMonitorService.CmsMonitorService.UpdateCmsOffLine(Object[] Args)
------------------------------
2017-10-20 15:31:11.547
CmsMonitorService.exe, CmsMonitorService.UpdateCmsOffLine
OffLineUpdater error: Creating folder
------------------------------
But this is killing the machines. When the code writes to the log, it removes the old contents line by line, which is painfully slow. It uses the following:
Do
    If allLines.Count = 0 Then
        Exit Do
    ElseIf allLines(0).StartsWith("-----") Then
        allLines.RemoveAt(0)
        Exit Do
    Else
        allLines.RemoveAt(0)
    End If
Loop
There can be thousands of these (they are at various locations).
What I wanted to do is find a way of removing the blocks between the dashes.
Thanks for any ideas everyone.....
Gareth
Here's a quick example of the solution I described in my comment above:
Imports System.Text

Module Module1

    Sub Main()
        Dim s = "keep line 1" & Environment.NewLine &
                "keep line 2" & Environment.NewLine &
                "----------" & Environment.NewLine &
                "remove line 1" & Environment.NewLine &
                "remove line 2" & Environment.NewLine &
                "----------" & Environment.NewLine &
                "keep line 3" & Environment.NewLine &
                "keep line 4" & Environment.NewLine &
                "----------" & Environment.NewLine &
                "remove line 3" & Environment.NewLine &
                "remove line 4" & Environment.NewLine &
                "----------" & Environment.NewLine &
                "keep line 5" & Environment.NewLine &
                "keep line 6" & Environment.NewLine
        Dim sb As New StringBuilder(s)
        'Work backwards from the last delimiter so earlier indexes stay valid in the StringBuilder.
        Dim endIndex = s.LastIndexOf("----------")

        Do While endIndex <> -1
            Dim startIndex = s.LastIndexOf("----------", endIndex - 1)
            'Add the length of the closing delimiter and its line break.
            Dim substring = s.Substring(startIndex, endIndex - startIndex + "----------".Length + Environment.NewLine.Length)

            'Remove the delimited block from the StringBuilder.
            sb.Replace(substring, String.Empty, startIndex, substring.Length)
            endIndex = s.LastIndexOf("----------", startIndex - 1)
        Loop

        Console.WriteLine("Before:")
        Console.WriteLine(s)
        Console.WriteLine("After:")
        Console.WriteLine(sb.ToString())
        Console.ReadLine()
    End Sub

End Module
Whether this is more efficient than working on a plain String may depend on the specifics of the text.
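If the log is already being handled as a list of lines, a single pass that toggles a flag at each delimiter is another option. The following is a rough sketch, assuming the blocks to remove sit between pairs of dashed lines as in the example above; the helper name is hypothetical.

```vbnet
Imports System.Collections.Generic

Module BlockFilterDemo
    ' Hypothetical helper: drops every line from an opening dashed delimiter
    ' through its closing delimiter, and keeps everything else.
    Function RemoveDelimitedBlocks(lines As IEnumerable(Of String)) As List(Of String)
        Dim kept As New List(Of String)
        Dim insideBlock = False
        For Each line In lines
            If line.StartsWith("-----", StringComparison.Ordinal) Then
                ' Delimiter lines flip the state and are themselves dropped.
                insideBlock = Not insideBlock
            ElseIf Not insideBlock Then
                kept.Add(line)
            End If
        Next
        Return kept
    End Function

    Sub Main()
        Dim input = {"keep line 1", "----------", "remove line 1", "----------", "keep line 2"}
        For Each line In RemoveDelimitedBlocks(input)
            Console.WriteLine(line)
        Next
        ' Prints "keep line 1" and "keep line 2".
    End Sub
End Module
```

Each line is visited once and nothing is removed from the front of a list, which is what makes the RemoveAt(0) loop slow. For a file, the input could come from File.ReadLines and the result could be written back with File.WriteAllLines.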

How Can I join a String "" with a Value in VBA powerpoint

So I have this textbox and a string I want to set a label to the string and the value combined like this:
Lebel1.Caption = "Hello" Textbox1.Text
I don't know what the proper code is.
I guess you have a typo: I'm not sure whether your control is named Lebel1 or Label1.
You can combine literal text with a control's text or a variable using the & operator. Don't forget to add a space after your words.
Label1.Caption = "Hello " & Textbox1.Text & "."
Label1.Caption = "Hello " & yourVariable & "."
You can also add a line break, or show the text in a message box, as follows:
MsgBox "Welcome Message." & Chr(10) & "Hello " & Textbox1.Text & "."

Find corrupt lines in textfiles and write them behind the line above

I have around 400 textfiles with circa 41000 corrupt lines.
I am searching for an option (VBA maybe?) which searches for these corrupt lines and basically executes a backspace, so that the corrupt lines are written behind the line before, because the corruption is caused by an unwanted wordwrap. The indicator for corrupt lines is that they don't start with the letters TEQ.
Does anyone have an idea how and where to build a script like that? Search and replace does not work, since I obviously can't put a backspace in the replace field.
Thanks in advance!
EDIT:
An example of a corrupted line:
TEQ;231232;OFNENJD;29840389;TPOS;
TEQ;54111232;O2D;29829;
TPOS;
Line 3 is the corrupted one since it belongs to line 2 but there was a wordwrap. I need to execute a backspace to get it back behind line 2. That's what i'd like to have automated.
To isolate the bad end-of-lines, first convert the good end-of-lines to a placeholder. You can then remove the remaining vbCrLf or vbLf, which has the effect of backspacing them away. The last step is to restore the good end-of-lines by reversing the placeholder substitution.
Dim str As String
'use your favorite method to read the TXT file into the str variable
str = Replace(str, Chr(59) & vbCrLf & "TEQ;", ChrW(8203)) 'convert good eol to Unicode zero-width space
str = Replace(str, vbLf, vbNullString) 'remove bad eols
str = Replace(str, ChrW(8203), Chr(59) & vbCrLf & "TEQ;") 'revert back to good eol
'write the str back to the TXT file
It wouldn't be a bad idea to throw a few of the .TXT files into a hex editor to determine whether the bad end-of-lines are created with vbCrLf (Chr(13) & Chr(10)) or just vbLf (Chr(10)). Same with the good end-of-lines although I suspect the good ones will be vbCrLF and the bad ones just vbLf.
The following Sub procedure requires that you go into the VBE's Tools ► References and add Microsoft Scripting Runtime to the project.
Sub fix_TEQ_text()
    Dim str As String, fp As String, fn As String
    Dim fso As New FileSystemObject, ts As TextStream

    fp = Environ("TEMP")
    fn = Dir(fp & Chr(92) & "TEQ*.txt", vbNormal)

    Do While CBool(Len(fn))
        If Not CBool(InStr(1, fn, "_fixed", vbTextCompare)) Then
            Set ts = fso.OpenTextFile(fp & Chr(92) & fn, ForReading)
            str = ts.ReadAll
            ts.Close
            str = Replace(str, Chr(59) & vbCrLf & "TEQ;", ChrW(8203)) 'convert good eol to Unicode zero-width space
            str = Replace(str, vbLf, vbNullString) 'remove bad eols
            str = Replace(str, ChrW(8203), Chr(59) & vbCrLf & "TEQ;") 'revert back to good eol
            Set ts = fso.CreateTextFile(fp & Chr(92) & Replace(fn, ".txt", "_fixed.txt"), True)
            ts.Write str
            ts.Close
        End If
        fn = Dir
    Loop
End Sub
You will want to change the file path (i.e. fp) and the file mask (currently "TEQ*.txt", which matched my sample TXT files).
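For comparison, here is a rough sketch of the same repair done line by line in VB.NET rather than VBA. It relies only on the rule from the question that a good line starts with TEQ; the helper name is hypothetical and file handling is left out.

```vbnet
Imports System.Collections.Generic

Module TeqJoinDemo
    ' Hypothetical helper: appends any line that does not start with "TEQ"
    ' to the end of the previous line, undoing the unwanted wrap.
    Function JoinWrappedLines(lines As IEnumerable(Of String)) As List(Of String)
        Dim fixedLines As New List(Of String)
        For Each line In lines
            If line.StartsWith("TEQ", StringComparison.Ordinal) OrElse fixedLines.Count = 0 Then
                fixedLines.Add(line)
            Else
                Dim last = fixedLines.Count - 1
                fixedLines(last) = fixedLines(last) & line
            End If
        Next
        Return fixedLines
    End Function

    Sub Main()
        Dim input = {"TEQ;231232;OFNENJD;29840389;TPOS;",
                     "TEQ;54111232;O2D;29829;",
                     "TPOS;"}
        For Each line In JoinWrappedLines(input)
            Console.WriteLine(line)
        Next
        ' Prints:
        ' TEQ;231232;OFNENJD;29840389;TPOS;
        ' TEQ;54111232;O2D;29829;TPOS;
    End Sub
End Module
```

Working on lines instead of raw text means it does not matter whether the bad line breaks are vbCrLf or vbLf, as long as the file is read with a method that splits on both.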

CSV in visual Basic

I am trying to separate the values from my text boxes using CSV. I am saving the file for Excel, and the titles go into individual cells, but the text box values all go into one cell. I want them to be in separate cells too.
Also, how can I add new information without overwriting the previous info saved?
Thanks
Dim csvFile As String = My.Application.Info.DirectoryPath & "\HoseData.csv"
Dim outFile As IO.StreamWriter = My.Computer.FileSystem.OpenTextFileWriter(csvFile, False)
outFile.WriteLine("job number, sales order number, date")
outFile.WriteLine(TextBox1.Text & TextBox2.Text & DateTimePicker1.Text)
outFile.Close()
Console.WriteLine(My.Computer.FileSystem.ReadAllText(csvFile))
You need to add commas in your output:
outFile.WriteLine(TextBox1.Text & "," & TextBox2.Text & "," & DateTimePicker1.Text)
Per the additional requirement of quotes around the DateTimePicker data that was fleshed out in the comments below:
outFile.WriteLine(TextBox1.Text & "," & TextBox2.Text & "," & """" & DateTimePicker1.Text & """")
To append instead of overwrite, as Plutonix mentioned above, use
OpenTextFileWriter(csvFile, True)
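If more fields may need quoting later, the joining and quoting can be pulled into a small helper. This is a minimal sketch under the usual CSV convention that a field containing a comma, a quote, or a line break is wrapped in double quotes with embedded quotes doubled; the function names and sample values are made up for illustration.

```vbnet
Imports System.Linq

Module CsvDemo
    ' Hypothetical helper: quotes a field only when it needs it,
    ' doubling any double quotes inside the value.
    Function CsvField(value As String) As String
        If value.IndexOfAny({","c, """"c, ControlChars.Cr, ControlChars.Lf}) >= 0 Then
            Return """" & value.Replace("""", """""") & """"
        End If
        Return value
    End Function

    ' Hypothetical helper: builds one CSV row from any number of fields.
    Function CsvLine(ParamArray fields As String()) As String
        Return String.Join(",", fields.Select(Function(f) CsvField(f)))
    End Function

    Sub Main()
        ' Sample values standing in for TextBox1.Text, TextBox2.Text and DateTimePicker1.Text.
        Console.WriteLine(CsvLine("J-100", "SO-7", "Friday, October 20, 2017"))
        ' Prints: J-100,SO-7,"Friday, October 20, 2017"
    End Sub
End Module
```

The resulting string can be passed to outFile.WriteLine in place of the hand-built concatenation.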

Write text to log file Columns

How do I write to the columns in a .log file?
I.e there are columns for "Log Text","Component","Date/Time"
How do I specify these when writing to a file?
I've got half of it working:
dim str As String ="<![LOG[" & message & "]LOG]!><time=""" & Now.ToLongTimeString & """" & " date=""" & Now.ToShortDateString & """ component=""" & component.ToString & """" & " type=""1""" & " Thread=""" & t & """" & ">"
File.AppendAllText(logfile, str & vbCrLf)
But the component, date/time and thread values aren't displaying properly.
What am I missing ?
*edited
The file path is "C:\Programdata\server.log"
So some of the text is getting into it in the right place, just not all of it.
So the log text column will get populated with "message" and thread comes in with the number but the date/time and component are empty.
I'd attach a pic but I don't have enough rep :/
In a sentence, I'm trying to replicate this:
http://www.jetico.com/web_help/bcwipe6_enterprise/img/log_viewer.jpg
but not all of my columns are displaying data.
Try dividing your data into columns using commas, as suggested by @Blackwood, but use String.Format, as you seem to have lots of extra "" in there:
Dim str As String = String.Format("![LOG[{0}]LOG]!,time={1}, <date= {2}, component= {3}, type=1, Thread={4}", Message, Now.ToLongTimeString, Now.ToLongDateString, component.ToString, t)
File.AppendAllText(logfile, str & vbCrLf)
I'm not sure what you are doing with the <> tags.
Solved.
This is to do with the format of the data I was trying to write.
Time has to be in the format
Now.ToLongTimeString & "." & Now.Millisecond & "-60"
and the date has to be separated with - instead of /
Dunno why: when viewed, the time doesn't go to that length and the date is displayed with "/".
The viewer I'm using for the log is CMTrace.
This is the line that got it working:
"<![LOG[" & message & "]LOG]!><time=""" & Now.ToLongTimeString & "." & Now.Millisecond & "-60" & """" & " date=""" & d & """ component=""" & component.ToString & """" & " type=""1" & """ Thread=""" & t & """>"
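The working line can be made easier to read with String.Format. The following is only a sketch: the helper name is hypothetical, the "-60" offset is copied from the line above, and the 24-hour time format and month-day-year date order are assumptions that should be checked against what your log viewer expects.

```vbnet
Imports System.Globalization

Module LogLineDemo
    ' Hypothetical helper: assembles one log line in the layout shown above.
    Function FormatLogLine(message As String, component As String, thread As Integer, stamp As DateTime) As String
        ' Assumption: 24-hour time with milliseconds, followed by the "-60" offset from the post.
        Dim timeText = stamp.ToString("HH:mm:ss.fff", CultureInfo.InvariantCulture) & "-60"
        ' Assumption: month-day-year order, separated with "-" instead of "/".
        Dim dateText = stamp.ToString("MM-dd-yyyy", CultureInfo.InvariantCulture)
        Return String.Format(
            "<![LOG[{0}]LOG]!><time=""{1}"" date=""{2}"" component=""{3}"" type=""1"" Thread=""{4}"">",
            message, timeText, dateText, component, thread)
    End Function

    Sub Main()
        Dim stamp As New DateTime(2017, 10, 20, 15, 30, 11, 481)
        Console.WriteLine(FormatLogLine("Service started", "Updater", 4, stamp))
        ' Prints: <![LOG[Service started]LOG]!><time="15:30:11.481-60" date="10-20-2017" component="Updater" type="1" Thread="4">
    End Sub
End Module
```

Using fixed format strings with the invariant culture keeps the output independent of the machine's regional settings, which ToLongTimeString and ToShortDateString are not.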