VB.NET - Split text doc using blank lines - vb.net

I have a text document with multiple lines, where each "subject" is separated from the next by a blank line, like this:
Block 1
Line 1
Line 2
Block 2
Line 1
Line 2
And so on. I have tried lots of variants using vbCr and the like, but I can't get the blocks split on the blank lines. The goal is to use each block's data individually.
Any help or direction would be greatly appreciated. Thanks in advance.

I'm a little unclear on what you're trying to do: are you trying to parse each "block" of data separately, and need to recognize a blank line as a blank line?
If that is the case, you could read each line as follows:
Dim tempString As String = ""
Using objReader As New System.IO.StreamReader(FILE_NAME)
    Do While objReader.Peek() <> -1
        tempString = objReader.ReadLine().Trim()
        If tempString.Equals("") Then
            ' We have a blank line ...
        Else
            ' Do something else with the tempString line
        End If
    Loop
End Using
There may be more sophisticated ways to do this, but this is what I'd do.
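As a rough sketch of how the line-by-line approach above could collect the lines into separate blocks: the function name ReadBlocks and the sample text are made up for illustration, and a StringReader stands in for the StreamReader so the example runs without a file.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module BlockReaderDemo
    ' Hypothetical helper: groups consecutive non-blank lines into blocks.
    Function ReadBlocks(reader As TextReader) As List(Of List(Of String))
        Dim blocks As New List(Of List(Of String))
        Dim current As New List(Of String)
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            If line.Trim().Length = 0 Then
                ' A blank line closes the current block (if it has any lines).
                If current.Count > 0 Then
                    blocks.Add(current)
                    current = New List(Of String)
                End If
            Else
                current.Add(line)
            End If
            line = reader.ReadLine()
        End While
        ' Keep the last block when the text does not end with a blank line.
        If current.Count > 0 Then blocks.Add(current)
        Return blocks
    End Function

    Sub Main()
        ' Sample data standing in for the file contents (an assumption).
        Dim sampleLines As String() = {"Block 1", "Line 1", "Line 2", "", "Block 2", "Line 1", "Line 2"}
        Dim sample As String = String.Join(Environment.NewLine, sampleLines)
        Using reader As New StringReader(sample)
            For Each block As List(Of String) In ReadBlocks(reader)
                Console.WriteLine(String.Join(" | ", block))
            Next
        End Using
    End Sub
End Module
```

With a real file, a System.IO.StreamReader opened on FILE_NAME could be passed to ReadBlocks in place of the StringReader, since both derive from TextReader.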

Try regular expressions.
Imports System.Text.RegularExpressions ' may be needed at the top of your file to use the code below
Dim blocks() As String = Regex.Split(myData, "\n[ \t]*\n")
' The regex looks for a newline "\n", followed by an optional run of whitespace "[ \t]*", followed by another newline
You may also need to work some "\r" into that regex (for example "\r?\n[ \t]*\r?\n"), because carriage return "\r" and line feed "\n" are combined in different ways in different data.
"\r\n" = vbCrLf
"\r" = vbCr
"\n" = vbLf

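A minimal, self-contained sketch of the regex approach, assuming the text uses Windows (CRLF) or Unix (LF) line endings. The variable myData holds made-up sample text here; in practice it would be the string read from the file.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module SplitBlocksDemo
    Sub Main()
        ' Sample data standing in for the file contents (an assumption).
        Dim sampleLines As String() = {"Block 1", "Line 1", "Line 2", "", "Block 2", "Line 1", "Line 2"}
        Dim myData As String = String.Join(vbCrLf, sampleLines)

        ' \r?\n matches either LF or CRLF; [ \t]* allows spaces or tabs on the "blank" line.
        Dim blocks() As String = Regex.Split(myData, "\r?\n[ \t]*\r?\n")

        For Each block As String In blocks
            Console.WriteLine("--- block ---")
            Console.WriteLine(block)
        Next
    End Sub
End Module
```

Note that this pattern treats exactly one blank line as a separator; several blank lines in a row would leave a stray line break at the start of the next block, so the pattern may need adjusting for such data.
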
Dim subjectsWithLines() As String = Split(stringThatYouReadFromFile, Chr(10))
Now, there are different kinds of blank lines: if Chr(10) doesn't work, try Chr(13) or Environment.NewLine.
ChicagoMike's answer works too, but because of the different kinds of "blank lines", use
tempString.Count < 1 instead of Equals

Related

Minify HTML code into one line

My program generates HTML code, which is then placed into a string variable. This HTML is ready to be written to a CSV file: the entire code is surrounded by quotes, and all inner double quotes are doubled to escape them. As a result, the user sees it nicely formatted in the output.
However, I have to convert this code to a single line, because I assume Excel has trouble with this 'well formatted' HTML due to the line breaks. So before I write it to the CSV, I want to collapse the HTML into one line. Can you tell me how to achieve that?
You can replace the line breaks with nothing to get single-line output:
Dim TestString As String
TestString = " <body>" & vbCrLf & " some html" & vbCrLf & " </body>"
Dim SingleLined As String
SingleLined = Replace(TestString, vbCrLf, "")
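The snippet above only removes vbCrLf pairs. If the generated HTML might also contain lone carriage returns or line feeds, a slightly broader sketch could look like the following. The function name ToSingleLine is made up for illustration.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SingleLineDemo
    ' Hypothetical helper: strips CRLF pairs first, then any lone CR or LF.
    Function ToSingleLine(text As String) As String
        Return text.Replace(vbCrLf, "").Replace(vbCr, "").Replace(vbLf, "")
    End Function

    Sub Main()
        ' Mixed line endings on purpose: one CRLF and one bare LF.
        Dim html As String = " <body>" & vbCrLf & " some html" & vbLf & " </body>"
        Console.WriteLine(ToSingleLine(html))
    End Sub
End Module
```

Replacing with a single space instead of an empty string may be preferable when words on adjacent lines would otherwise run together.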

I printed text that contains multiple "\"'s but "\S" turned into a "1"

So I have a C1TrueDBGrid on my form (which is a ComponentOne control), and I give the user the option to print the contents of the grid.
When printed, I include a header with some text. This is my code for printing:
Dim dlgPrint As New PrintDialog
dlgPrint.ShowDialog()
dgvList.PrintInfo.PrintEmptyGrid = False
dgvList.PrintInfo.PageHeader = txtDirectory.Text & Environment.NewLine & "Search Term: " & txtSearch.Text & Environment.NewLine
dgvList.PrintInfo.PageSettings.Landscape = True
dgvList.PrintInfo.WrapText = C1.Win.C1TrueDBGrid.PrintInfo.WrapTextEnum.Wrap
dgvList.PrintInfo.RepeatColumnHeaders = True
dgvList.PrintInfo.Print(dlgPrint.PrinterSettings)
dlgPrint.Dispose()
txtDirectory.Text as I'm sure you can imagine contains the path for a directory, which includes back-slashes \ . What actually got printed turned the instances of \S into 1.
For example: txtDirectory.Text = \\Server02\Users\Me\J\Star
page that got printed = \1erver02\Users\Me\J1tar
Is "\S" a printer command for "1" or something? Is there a list somewhere of what all such commands are, if that's the case? Either way, how do I get it to print the actual text?
Thank you!
You are setting that text to a PageHeader, and according to ComponentOne, \S is a special character that returns the total number of sub-pages, or "1" in your example. You will need to double-escape any of the characters in the list on that page.
Updates have been posted to this ComponentOne forum thread.
So what I did was simply assign the string I want to print to a variable printText and then replace those special characters accordingly (String.Replace returns a new string, so the result has to be assigned back):
printText = printText.Replace("\t", "\\t")
printText = printText.Replace("\p", "\\p")
printText = printText.Replace("\P", "\\P")
printText = printText.Replace("\g", "\\g")
printText = printText.Replace("\G", "\\G")
printText = printText.Replace("\s", "\\s")
printText = printText.Replace("\S", "\\S")
Just note that the "\\t" is not yet working like the others; they are looking into it.
Thanks @DonBoitnott for the original link!
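The replacements above can also be wrapped in a small helper. This is only a sketch of the string handling: the list of codes is copied from the answer above and may not be the complete set the control recognises, and the name EscapeHeaderCodes is made up. It uses no ComponentOne API.

```vbnet
Imports System

Module HeaderEscapeDemo
    ' Codes taken from the answer above; the real control may recognise others.
    Private ReadOnly HeaderCodes As String() = {"\t", "\p", "\P", "\g", "\G", "\s", "\S"}

    ' Hypothetical helper: puts an extra backslash in front of each special code.
    ' String.Replace is case-sensitive, so "\s" and "\S" are handled separately.
    Function EscapeHeaderCodes(text As String) As String
        For Each code As String In HeaderCodes
            text = text.Replace(code, "\" & code)
        Next
        Return text
    End Function

    Sub Main()
        ' Prints \\\Server02\Users\Me\J\\Star
        Console.WriteLine(EscapeHeaderCodes("\\Server02\Users\Me\J\Star"))
    End Sub
End Module
```

The escaped string would then be assigned to PageHeader in place of txtDirectory.Text. Whether the control renders the doubled backslashes as intended is something to verify against the ComponentOne documentation.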

VB.Net Writing to Txt File

I'm trying to write the content of my textbox to a txt file.
My code works fine but my error is, when I open txt file I see
writeline1writeline2writeline3
instead of
writeline1
writeline2
writeline3
My code:
Dim result As List(Of String) = New List(Of String)
convertedText.Lines = result.ToArray()
My.Computer.FileSystem.WriteAllText(mypath & "\convertedcontent.txt", convertedText.Text, False)
Writing to .csv and many other file types work fine but I don't know how to break lines for text file.
Thanks in advance
I would use System.IO.File.WriteAllLines:
Dim path = System.IO.Path.Combine(mypath, "convertedcontent.txt")
System.IO.File.WriteAllLines(path, result)
Otherwise you need to append Environment.NewLine to each line, you can use String.Join:
System.IO.File.WriteAllText(path, String.Join(Environment.NewLine, result))
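A self-contained sketch of the WriteAllLines approach, assuming the lines are already in a List(Of String). The sample lines and the temp-folder location are placeholders chosen so the example can run anywhere.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module WriteLinesDemo
    Sub Main()
        ' Sample lines standing in for the converted text (an assumption).
        Dim result As New List(Of String) From {"writeline1", "writeline2", "writeline3"}

        ' Written to the temp folder here; the original code uses mypath instead.
        Dim outputPath As String = Path.Combine(Path.GetTempPath(), "convertedcontent.txt")

        ' WriteAllLines appends a line terminator after every element.
        File.WriteAllLines(outputPath, result)

        ' Read the file back to confirm each entry is on its own line.
        For Each line As String In File.ReadAllLines(outputPath)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```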
You need to add & vbCrLf to your strings (each line).
I'm not sure where you are getting your strings from, but you will have to add the carriage return/line feed characters to those strings, one pair at the end of every string.
You might even just loop through your array and add them there.
P.S. Some of the comments have quicker ways of getting there, but this is probably what happens behind the scenes...
Dim lines() As String = convertedText.Lines ' Lines returns a copy, so modify a local array
For i As Integer = 0 To lines.Length - 1
    lines(i) &= vbCrLf
Next

Avoid extra "carriage return" in Print statement with Visual Basic?

With Visual Basic, I'm confused by the behavior of the Print statement: sometimes the following statement adds an extra carriage return ("^M") at the end of a line, and sometimes it doesn't. I wonder why?
filePath = "d:\tmp\FAE-IMM-Report-2012-Week.org"
If Dir(filePath) <> "" Then
Kill filePath
End If
outFile = FreeFile()
Open filePath For Output As outFile
Print #outFile, "#+TITLE: Weekly Report"
would produce
#+TITLE: Weekly Report^M
while I wish without ^M:
#+TITLE: Weekly Report
In one test program, almost the same code would produce no "^M".
Please help! Thanks a lot.
Upon further experiment, I found that the suggestion below (using vbNewLine and ";" at the end of the printed content) still does not solve my problem.
After careful isolation, I found that the cause of the problem is a character that looks like a space, but is not exactly a space, followed by a newline and carriage return. Before printing the text containing the offending string, there was no carriage return, but once the offending line is printed, every line, including the lines printed previously, has a carriage return.
I'm not sure exactly what the offending string is, as my VBA skills are not yet very good.
Here is a copy of the offending text from a spreadsheet cell:
"There is something invisible after this visible text
After the invisible text, then there might be a carriage return $Chr(13) and/or newline"
I'm not sure whether pasting into a web browser preserves the content, though. When I paste it into Emacs, I don't see a carriage return, and Emacs would display one if it were there. So I guess there is no carriage return in the offending string.
Below is a program that demonstrates the problem:
Sub DemoCarriageReturnWillAppear()
    Dim filePath As String
    Dim outFile
    Dim offendingText
    filePath = "d:\tmp\demoCarriageReturn.org"
    If Dir(filePath) <> "" Then
        Kill filePath
    End If
    outFile = FreeFile()
    Open filePath For Output As outFile
    Print #outFile, "#+AUTHOR: Yu Shen" & vbNewLine;
    Close #outFile 'At this moment, there is no carriage return
    Open filePath For Append As outFile
    offendingText = ThisWorkbook.Worksheets("Sheet1").Range("A1")
    Print #outFile, offendingText & vbNewLine;
    Close #outFile 'Now, every line end has carriage return.
    'It must be caused by something offending at the above print out content.
End Sub
Here is the final result of the above procedure:
#+AUTHOR: Yu Shen^M
There is something invisible after this visible text
After the invisible text, then there might be a carriage return $Chr(13) or newline^M
Note the above "^M" is added by me, as carriage return would not be visible in browser.
If you're interested, I can send you the excel file with the offending content.
I need your help on how to avoid those offending string, or the carriage returns.
(I even tried using Replace on the carriage return or newline, since I found that once I manually deleted whatever caused the break to another line, the problem went away. But calling Replace to replace vbNewLine, Chr$(13), or vbCrLf did not make any difference.)
Thanks for your further help!
Yu
Use a trailing semicolon to suppress the new line:
Print #outFile, "#+TITLE: Weekly Report";
The VB Editor will often add a semicolon if you make a mistake in the statement which could explain why the new line is sometimes output and sometimes not.
New diagnostic routine
We need to know the character within cell A1 that is causing the problem.
Place the following subroutine within one of your modules.
Public Sub DsplInHex(Stg As String)
    Dim Pos As Long
    For Pos = 1 To Len(Stg)
        Debug.Print Hex(AscW(Mid(Stg, Pos, 1))) & " ";
    Next
    Debug.Print
End Sub
Go to the VB Editor's Immediate window and type in the following text followed by Return:
DsplInHex(Sheets("Sheet1").range("A1"))
Underneath this line, you should see something like 54 65 73 74 31. This is a list of the code value of each character in the cell. I expect we will see A, the code for line feed, or D, the code for carriage return, at the end of the list.
Position the cursor in cell A1. Click F2 to select edit then Backspace to delete the invisible trailing character then Return to end the edit. Go back to the Immediate Window, position the cursor to the end of DsplInHex(Sheets("Sheet1").range("A1")) and click Return. The trailing character should have gone.
Try that and report back. Good luck.
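For readers working in VB.NET rather than VBA, the same diagnostic idea (listing the code of every character so that invisible ones show up) could be sketched roughly as follows. The function name ToHexCodes and the sample string are made up for illustration.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module HexDumpDemo
    ' Hypothetical helper: returns the hex code of each character, separated by spaces.
    Function ToHexCodes(text As String) As String
        Dim sb As New StringBuilder()
        For Each ch As Char In text
            sb.Append(Hex(AscW(ch))).Append(" ")
        Next
        Return sb.ToString().TrimEnd()
    End Function

    Sub Main()
        ' A trailing line feed shows up as A at the end: 54 65 73 74 31 A
        Console.WriteLine(ToHexCodes("Test1" & vbLf))
    End Sub
End Module
```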
To help other people in the future, here is a summary of my problem and the solution. The extra carriage return on each line, even with a semicolon at the end of the Print statement, was actually caused by a string consisting of a space followed by a line feed (Chr$(10)) in one of the Print statements. Once such a string is printed, all previously and subsequently printed content gets an extra carriage return!
It seems to be a bug in VBA 6 (with Excel 2007), and a nasty one!
My workaround was to replace the line feed with a space.
Thanks to Tony, whose repeated help enabled me to finally nail down the cause.
Here is the code to demonstrate the problem:
Sub DemoCarriageReturnWillAppearOnAllLines()
    Dim filePath As String
    Dim outFile
    Dim offendingText
    filePath = "d:\tmp\demoCarriageReturn.org"
    If Dir(filePath) <> "" Then
        Kill filePath
    End If
    outFile = FreeFile()
    Open filePath For Output As outFile
    Print #outFile, "#+AUTHOR: Yu Shen" & vbNewLine;
    Close #outFile 'At this moment, there is no carriage return
    Open filePath For Append As outFile
    offendingText = " " & Chr$(10)
    Print #outFile, offendingText & vbNewLine;
    Close #outFile 'Now, every line end has carriage return.
    'It must be caused by the offending at the above print out content.
End Sub
After the first "Close #outFile", here is the content of the file demoCarriageReturn.org:
#+AUTHOR: Yu Shen
Note: in an editor capable of showing a carriage return as a visible ^M, no carriage return is present.
However, after the second "Close #outFile", here is the content of the same file with additional content:
#+AUTHOR: Yu Shen^M
^M
Note: two carriage returns appear, and neither is intended. In particular, the first line had already been printed, and at the first Close statement it was found to have no carriage return. (To illustrate the carriage returns, I have to type ^M on this web page, but they are really in the output file.)
This is why I think it's a bug: the carriage returns are not intended. It's an undesirable surprise.
The following code shows that if I filter out the linefeed character the problem would be gone.
Sub DemoCarriageReturnWillNotAppearAtAll()
    Dim filePath As String
    Dim outFile
    Dim offendingText
    filePath = "d:\tmp\demoCarriageReturn.org"
    If Dir(filePath) <> "" Then
        Kill filePath
    End If
    outFile = FreeFile()
    Open filePath For Output As outFile
    Print #outFile, "#+AUTHOR: Yu Shen" & vbNewLine;
    Close #outFile 'At this moment, there is no carriage return
    Open filePath For Append As outFile
    offendingText = " " & Chr$(10)
    Print #outFile, Replace(offendingText, Chr$(10), "") & vbNewLine;
    Close #outFile 'Now, no more carriage return.
    'The only change is removing the linefeed character in the second print statement
End Sub
After full execution of the above program, there is indeed no carriage return!
#+AUTHOR: Yu Shen
This shows that string combination of space followed by linefeed caused the bug, and removing linefeed can avoid the bug.
The following code further demonstrates that if there is no offending string, there is no undesired carriage return, even without a newline and semicolon at the end of the Print statement!
Sub DemoCarriageReturnWillNotAppearAtAllEvenWithoutNewLineFollowedBySemiColon()
    Dim filePath As String
    Dim outFile
    Dim offendingText
    filePath = "d:\tmp\demoCarriageReturn.org"
    If Dir(filePath) <> "" Then
        Kill filePath
    End If
    outFile = FreeFile()
    Open filePath For Output As outFile
    Print #outFile, "#+AUTHOR: Yu Shen"
    Close #outFile 'At this moment, there is no carriage return
    Open filePath For Append As outFile
    offendingText = " " & Chr$(10)
    Print #outFile, Replace(offendingText, Chr$(10), "")
    Close #outFile 'Now, no more carriage return.
    'The real change is removing the linefeed character in the second print statement
End Sub
Also in the output result:
#+AUTHOR: Yu Shen
Still no annoying carriage return!
This shows that using a newline followed by a semicolon at the end of the Print statement is not the solution to the problem of a carriage return on every line! The real solution is to avoid any string of a space followed by a line feed in the printed content.
Yu

Creating a loop in StringBuilder to alter a text file

I would be grateful for some help with reading a text file into a RichTextBox. The code I have at present alters the first line of text as I want, but the rest of the lines are not altered. I need a loop that reads to the end of the file and displays the result in the RichTextBox. The code I have at present is this:
Dim FILE_NAME As String = "C:\Test.txt"
Dim sr As New System.IO.StreamReader(FILE_NAME)
RichTextBox1.Text = sr.ReadToEnd
Dim sb As New System.Text.StringBuilder(RichTextBox1.Text)
sb.Insert(5, " ")
sb.Insert(12, " ")
sb.Insert(18, " ")
sb.Insert(25, " ")
sb.Insert(29, " ")
sb.Insert(32, " ")
sb.Insert(37, " ")
sb.Insert(44, " ")
sb.Insert(45, " ")
RichTextBox2.Text = sb.ToString
sr.Close()
I think you just want RichTextBox1.LoadFile("C:/test.txt", RichTextBoxStreamType.PlainText)
that should be a backslash in the file name but my keyboard doesn't have one on this pc
The reason for the spaces is that every line of text has the same fixed-width layout, and the fields need to be separated to make them more readable. The original text looks like this:
17915WHITE BLUE 001.900116A T123456111
72451BLACK ORANGE000.500208 B A123456123 'worst case
72455BLACK WHITE 002.703501 C123456124
Needs to look like below.
17915:WHITE BLUE :001.9:001:16:A :T:123456:111
72451:BLACK ORANGE:000.5:002:08: B :A:123456:123
72455:BLACK WHITE :002.7:035:01: :C:123456:124
I can write the first line to a text file, but I cannot reproduce the rest of the lines. I think I need a loop that keeps reading the text file until the end of the file is reached.
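One way to approach this is to split each line into fixed-width fields and join them with the separator, once per line, instead of inserting at positions in the whole text. The sketch below is only an illustration: the field widths are guessed from the "worst case" sample line above and would need checking against the real file layout, and the names FieldWidths and SeparateFields are made up.

```vbnet
Imports System
Imports System.Collections.Generic

Module FixedWidthDemo
    ' Assumed field widths, inferred from the "worst case" sample line.
    Private ReadOnly FieldWidths As Integer() = {5, 12, 5, 3, 2, 3, 1, 6, 3}

    ' Hypothetical helper: cuts one line into fields and joins them with the separator.
    Function SeparateFields(line As String, separator As String) As String
        Dim parts As New List(Of String)
        Dim pos As Integer = 0
        For Each width As Integer In FieldWidths
            If pos >= line.Length Then Exit For
            ' Guard against lines that are shorter than the expected layout.
            Dim take As Integer = Math.Min(width, line.Length - pos)
            parts.Add(line.Substring(pos, take))
            pos += take
        Next
        ' Keep any trailing characters beyond the known fields.
        If pos < line.Length Then parts.Add(line.Substring(pos))
        Return String.Join(separator, parts)
    End Function

    Sub Main()
        ' In the real program the lines would come from the text file.
        Dim lines As String() = {"72451BLACK ORANGE000.500208 B A123456123"}
        For Each line As String In lines
            ' Prints 72451:BLACK ORANGE:000.5:002:08: B :A:123456:123
            Console.WriteLine(SeparateFields(line, ":"))
        Next
    End Sub
End Module
```

In the form itself, the loop could run over System.IO.File.ReadAllLines("C:\Test.txt") and the converted lines could be assigned to RichTextBox2.Lines, which avoids tracking insert positions across the whole text.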