Parsing a text file row by row - vb.net

I have a text file that I need to parse, storing the parsed data in a database:
Name Qty1 Qty2 Name Qty1 Qty2
ABC 1 2
BCD 2 3
EFG 7 9 PQR 56 97
DEF 3 18 RET 988 11
I have a table that should receive this data. Its structure is:
Name, Qty1, Qty2, Col
If I parse from the left side, I can insert ABC, 1, 2, L into the table; if I parse from the right side, I can insert PQR, 56, 97, R into the same table.
My problem is how to differentiate between the left and right columns. When I start reading, I get ABC, 1, 2, but I don't know whether there is a value in the right column. If I keep reading, I get BCD, 2, 3, and at that point I don't know whether BCD belongs to the left or the right column, so I can't tell whether to store L or R in the database.
I am trying to parse this file in .NET using Substring and IndexOf. The file is generated from a PDF document. Below is the code that reads the PDF document:
Public ReadOnly Property getParsedFile() As String
    Get
        Dim document As New PDFDocument(filePath)
        Dim parsedFile As StringBuilder = New StringBuilder()
        For i As Integer = 0 To document.Pages.Count - 1
            parsedFile.Append(document.Pages(i).GetText())
        Next
        Return parsedFile.ToString()
    End Get
End Property
Any help will be greatly appreciated.
Below is the answer:
' Requires: Imports System.Text, Imports iTextSharp.text.pdf, Imports iTextSharp.text.pdf.parser
Public Function ExtractTextFromPdf(path As String) As String
    Using reader As New PdfReader(path)
        Dim str As New StringBuilder()
        For i As Integer = 1 To reader.NumberOfPages
            ' Create a fresh strategy per page; a reused LocationTextExtractionStrategy
            ' keeps the text of earlier pages and would repeat it.
            Dim its As ITextExtractionStrategy = New LocationTextExtractionStrategy()
            Dim thePage As String = PdfTextExtractor.GetTextFromPage(reader, i, its)
            Dim theLines As String() = thePage.Split(ControlChars.Lf)
            For Each theLine As String In theLines
                str.AppendLine(theLine)
            Next
        Next
        ' saveTextFileOnComputer is the poster's own helper (not shown here)
        saveTextFileOnComputer(str.ToString())
        Return str.ToString()
    End Using
End Function
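Once the text is extracted, one possible way to tell the two columns apart is sketched below. It treats a line with six tokens as holding both a left and a right record, and assigns a line with three tokens by where it starts on the line. The `Entry` class, the `rightColumnStart` offset, and the premise that the extracted text keeps the right column's horizontal position as leading spaces are all assumptions made for illustration; the extraction above does not guarantee that spacing is preserved.

```vbnet
Imports System
Imports System.Collections.Generic

Module ColumnSplitDemo
    ' Hypothetical record type: one table row, with Side set to "L" or "R".
    Public Class Entry
        Public Name As String
        Public Qty1 As Integer
        Public Qty2 As Integer
        Public Side As String
    End Class

    ' rightColumnStart is an assumed character offset at which the right column begins.
    Public Function ParseLine(line As String, rightColumnStart As Integer) As List(Of Entry)
        Dim result As New List(Of Entry)
        Dim tokens As String() = line.Split(New Char() {" "c, ControlChars.Tab},
                                            StringSplitOptions.RemoveEmptyEntries)
        If tokens.Length = 6 Then
            ' Both columns are present on this line.
            AddEntry(result, tokens, 0, "L")
            AddEntry(result, tokens, 3, "R")
        ElseIf tokens.Length = 3 Then
            ' Only one column: decide by the position of the first non-blank character.
            Dim firstChar As Integer = line.Length - line.TrimStart().Length
            AddEntry(result, tokens, 0, If(firstChar >= rightColumnStart, "R", "L"))
        End If
        Return result
    End Function

    ' Adds a record only when both quantities are numeric, which skips the header row.
    Private Sub AddEntry(result As List(Of Entry), tokens As String(),
                         offset As Integer, side As String)
        Dim q1, q2 As Integer
        If Integer.TryParse(tokens(offset + 1), q1) AndAlso
           Integer.TryParse(tokens(offset + 2), q2) Then
            result.Add(New Entry With {.Name = tokens(offset), .Qty1 = q1,
                                       .Qty2 = q2, .Side = side})
        End If
    End Sub

    Sub Main()
        Dim lines As String() = {
            "Name Qty1 Qty2 Name Qty1 Qty2",
            "ABC 1 2",
            "EFG 7 9 PQR 56 97",
            "            RET 988 11"
        }
        For Each line As String In lines
            For Each item As Entry In ParseLine(line, 10)
                Console.WriteLine("{0},{1},{2},{3}", item.Name, item.Qty1, item.Qty2, item.Side)
            Next
        Next
    End Sub
End Module
```

If the extracted text does not preserve spacing, a three-token line cannot be classified this way, and the position information would have to come from the PDF extraction step instead.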

Related

How to format textfiles values retrieved from a directory and displayed in datagridview in vb.net

The problem is how to format values read from text files when they are displayed in a DataGridView.
I read the values by looping through the text files and removing the first two characters of each line. Now I want to add separators or change the format of the displayed values, for example:
textfile lines             result
01Sample       (line 1)
022            (line 2)
0306212019     (line 3)    06/21/2019
041234567890   (line 4)    12,345,678.90
I have already tried changing the DefaultCellStyle, but since the values come from text files in a directory, it does not affect the output:
DataGridView1.Columns("Gross Sales").DefaultCellStyle.Format = "##,0"
Private Sub ReadTextFiles()
    Dim dt As New DataTable
    dt.Columns.Add("Date")
    dt.Columns.Add("Gross Sales")
    Dim Folder As New IO.DirectoryInfo("c:\test\")
    Dim lstLines As New List(Of String)
    For Each fileentries As String In Folder.GetFiles("s*", IO.SearchOption.AllDirectories).OrderByDescending(Function(x) x.Name).Select(Function(x) x.FullName)
        lstLines.AddRange(File.ReadAllLines(fileentries))
    Next
    Dim i As Integer
    Dim OuterLoopIterations As Integer = CInt(lstLines.Count / 22)
    For iterations = 0 To OuterLoopIterations - 1
        Dim row = dt.NewRow
        For col = 0 To 21
            row(col) = lstLines(i).Remove(0, 2) 'i have removed the first 2 characters of each string
            i += 1
        Next
        dt.Rows.Add(row(2), row(5), row(12), row(13), row(14), row(15), row(7), row(8), row(11))
    Next
    DataGridView1.DataSource = dt
    'the code i tried applying
    DataGridView1.Columns("Gross Sales").DefaultCellStyle.Format = "##,0"
End Sub
Expected result:
Date column: 06/07/2019
Gross Sales column: 48,990.14
Edit:
I tried this one
Dim B As Double
Dim Folder As New IO.DirectoryInfo("c:\test\")
Dim lstLines As New List(Of String)
For Each fileentries As String In Folder.GetFiles("s*", IO.SearchOption.AllDirectories).OrderByDescending(Function(x) x.Name).Select(Function(x) x.FullName)
B = CDbl(Val(fileentries))
lstLines.AddRange(File.ReadAllLines(B))
Next
If you want to format something as a number then it has to be a number. That means that, for example, if you read the text "1234.5" from the file and you want to display it as 1,234.50 in your grid then you have to convert the String you read to a Double or Decimal. If you do that then the numeric format specifier you're using in the grid column will work.
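A minimal sketch of that conversion, using the sample values from the question: the raw strings (after the two-character prefix is removed) are parsed into `DateTime` and `Decimal` and stored in typed `DataTable` columns. The `BuildTable` helper is hypothetical, and the assumptions that the date is laid out as MMddyyyy and that the last two digits of the amount are cents are inferred from the expected output shown in the question.

```vbnet
Imports System
Imports System.Data
Imports System.Globalization

Module TypedColumnsDemo
    ' Hypothetical helper: builds a table with typed columns from two raw strings.
    Function BuildTable(rawDate As String, rawGross As String) As DataTable
        Dim dt As New DataTable()
        dt.Columns.Add("Date", GetType(DateTime))
        dt.Columns.Add("Gross Sales", GetType(Decimal))

        ' Assumption: the date text is MMddyyyy, e.g. "06212019".
        Dim d As DateTime = DateTime.ParseExact(rawDate, "MMddyyyy", CultureInfo.InvariantCulture)
        ' Assumption: the amount has two implied decimal places, e.g. "1234567890" is 12345678.90.
        Dim gross As Decimal = Decimal.Parse(rawGross, CultureInfo.InvariantCulture) / 100D

        dt.Rows.Add(d, gross)
        Return dt
    End Function

    Sub Main()
        Dim dt As DataTable = BuildTable("06212019", "1234567890")
        ' Format the typed values the same way a grid column format would.
        Console.WriteLine(CDate(dt.Rows(0)("Date")).ToString("MM/dd/yyyy", CultureInfo.InvariantCulture))
        Console.WriteLine(CDec(dt.Rows(0)("Gross Sales")).ToString("N2", CultureInfo.InvariantCulture))
    End Sub
End Module
```

With typed columns bound to the grid, format strings such as `DataGridView1.Columns("Gross Sales").DefaultCellStyle.Format = "N2"` and `DataGridView1.Columns("Date").DefaultCellStyle.Format = "MM/dd/yyyy"` should take effect, because the cell values are no longer plain strings.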

vb.net save datagridview to text file with delimiter

How can I save cell values to a text file with a delimiter?
I have 9 rows and this code:
Dim newoutputlines As New List(Of String)
Dim finlines As New List(Of String)
Dim aas As String = ""
For x As Integer = 0 To DataGridView1.Rows.Count - 1
    For v As Integer = 0 To 9
        'extracting cell value from 0 to 9 and add it on list
        newoutputlines.Add(DataGridView1.Rows(x).Cells(v).Value)
    Next
    'adding delimiter to list
    aas = String.Join("|", newoutputlines.ToArray())
    finlines.Add(aas)
Next
IO.File.WriteAllLines(FILE_NAME, finlines.ToArray)
In the text file, I want each row saved in this format:
0|1|2|3|4|5|6|7|8|9 'this is from index 0 of gridview
0|3|0|8|6|5|6|7|8|0 'this is from index 1 of gridview
6|1|2|5|4|5|6|7|5|59 'this is from index 2 of gridview
But it fails; the result in my text file looks like this:
0|1|2|3|4|5|6|7|8|9
0|1|2|3|4|5|6|7|8|9|0|3|0|8|6|5|6|7|8|0
0|1|2|3|4|5|6|7|8|9|0|3|0|8|6|5|6|7|8|0|6|1|2|5|4|5|6|7|5|59
Your newoutputlines list still holds the values from the previous rows. You need to clear it each time you start a new row.
I also want to show an approach that builds the file contents with a StringBuilder. If you have a lot of rows, this approach will be more efficient.
Dim text As New StringBuilder()
For x As Integer = 0 To DataGridView1.Rows.Count - 1
    For v As Integer = 0 To 9
        'put a delimiter before every cell except the first
        If v > 0 Then text.Append("|")
        'append the cell value (columns 0 to 9) to the text
        text.Append(DataGridView1.Rows(x).Cells(v).Value.ToString())
    Next
    'end the current line
    text.AppendLine()
Next
IO.File.WriteAllText(FILE_NAME, text.ToString())
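The first fix mentioned in the answer, clearing the list for every row, can be sketched as follows. To keep the example runnable without a form, a `List(Of String())` stands in for the grid rows; that stand-in and the `BuildLines` name are assumptions for illustration only.

```vbnet
Imports System
Imports System.Collections.Generic

Module ClearPerRowDemo
    ' rows stands in for DataGridView1.Rows; each array holds one row's cell values.
    Function BuildLines(rows As List(Of String())) As List(Of String)
        Dim finLines As New List(Of String)
        For Each row As String() In rows
            ' A new list per row, so values from earlier rows are not carried over.
            Dim cells As New List(Of String)
            For Each cell As String In row
                cells.Add(cell)
            Next
            finLines.Add(String.Join("|", cells.ToArray()))
        Next
        Return finLines
    End Function

    Sub Main()
        Dim rows As New List(Of String())
        rows.Add(New String() {"0", "1", "2"})
        rows.Add(New String() {"0", "3", "0"})
        For Each line As String In BuildLines(rows)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

In the original code the equivalent change would be to call `newoutputlines.Clear()` at the start of each outer loop iteration.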

Split up strings into RTF table cells and max total width

VB2010. This may be a bit hard to explain, but I have a list of class instances, and one field is a string message. I'm outputting these messages to an RTF document and want to maximize the use of horizontal space, so I'm trying to dynamically create a table that fits as many messages as possible in one row before starting another row, all while keeping each row within a maximum width.
Public Class TripInfo
Public SysId As String = ""
Public CreateDate As DateTime
Public OutMessage As String = ""
Public OutMessageWidth As Integer = 0 'the width of the message in char count up to first LF
End Class
Dim myTrips1 as New List(Of TripInfo)
Dim myTrips2 as New List(Of TripInfo)
So as I iterate through the lists I want to create rows that are themselves no longer than 45 characters. Something like:
---------------------------------------------
|"Message1 |"Message2 |"Much longer message |
| Trip1 "| Trip2" | with two lines" |
---------------------------------------------
|"message is even longer than the others" |
---------------------------------------------
|"trip is ok |"trip was cancelled due to CW |
| enroute" | must log to system" |
---------------------------------------------
|"Message3 |"Message4 |"Message5 |"Message6"|
| Trip3 "| Error" | Stop" | |
---------------------------------------------
*Note that the message itself can span more than 1 line with LFs to display a multi-line message
I have scratch code to write the RTF code for the tables and have substituted fake messages with multiple embedded LFs and the output looks good.
Dim sbTable As New StringBuilder
sbTable.Append("\pard \trowd\trql\trgaph108\trleft36\cellx1636\cellx3236\cellx4836\intbl R1C1\cell R1C2\cell R1C3\cell\row \pard")
sbTable.Append("\pard \trowd\trql\trgaph108\trleft36\cellx4642\intbl R1C1\cell\row \pard")
sbTable.Append("\pard \trowd\trql\trgaph0\trleft36\cellx4642\cellx5500\intbl R1C1\cell R1C2\cell\row \pard")
However, I just can't get my head around how to start doing this dynamically. It seems I may need two passes: one to break up the messages into rows, and another to actually write the RTF code.
So far I have pseudocode, but I need some help with my logic.
dim v as integer = 0 'total width of current row
For each t as TripInfo in myTrips1 and myTrips2
if (t.OutMessageWidth added to v) > 45 then
start new row and append
else
append to current row
endif
Next t
Without knowing the properties of your TripInfo class, I'm going to have to make some assumptions. If any property I assume doesn't exist, you can either create it or modify the code to get the same effect.
'Assumes TripInfo has a Lines property (String()) holding OutMessage split on LF
Dim AllTrips As New List(Of TripInfo)
AllTrips.AddRange(myTrips1)
AllTrips.AddRange(myTrips2)
'Removing items inside a For Each would throw, so take trips off the list with While loops
While AllTrips.Count > 0
    Dim t As TripInfo = AllTrips(0)
    AllTrips.RemoveAt(0)
    Dim NewRow() As String = CType(t.Lines.Clone(), String()) 'copy so t.Lines is not modified
    Dim w As Integer = t.OutMessageWidth
    Dim h As Integer = NewRow.Length
    'Prefer companions of the same height, then shorter ones
    For ItemHeight As Integer = h To 1 Step -1
        Dim i As Integer = 0
        While i < AllTrips.Count
            Dim CompareTrip As TripInfo = AllTrips(i)
            If CompareTrip.Lines.Length = ItemHeight _
               AndAlso w + CompareTrip.OutMessageWidth <= 45 Then
                For l As Integer = 0 To h - 1
                    'Pad to the current width first, then append; shorter messages leave blank lines
                    Dim part As String = If(l < ItemHeight, CompareTrip.Lines(l), "")
                    NewRow(l) = NewRow(l).PadRight(w) & part
                Next
                w += CompareTrip.OutMessageWidth
                AllTrips.RemoveAt(i)
            Else
                i += 1
            End If
        End While
    Next
    'Write lines of NewRow to your RTF
End While
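For the second pass the question describes (writing the RTF once the messages are grouped into rows), a rough sketch follows. It reproduces the control-word pattern of the scratch code in the question. The `BuildRtfRow` name and the `TwipsPerChar` conversion factor are assumptions for illustration; the real factor depends on the font, and messages containing `\`, `{`, or `}` would need escaping, which this sketch does not do.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module RtfRowDemo
    ' Assumed conversion from character count to twips; adjust for the actual font.
    Const TwipsPerChar As Integer = 120
    ' Matches the \trleft36 value used in the question's scratch code.
    Const LeftEdge As Integer = 36

    ' Builds one RTF table row; widths are character counts, one per message.
    Function BuildRtfRow(messages As IList(Of String), widths As IList(Of Integer)) As String
        Dim sb As New StringBuilder("\pard \trowd\trql\trgaph108\trleft36")
        Dim rightEdge As Integer = LeftEdge
        ' Each \cellx gives the right boundary of a cell, measured from the left margin.
        For Each w As Integer In widths
            rightEdge += w * TwipsPerChar
            sb.Append("\cellx").Append(rightEdge)
        Next
        sb.Append("\intbl")
        For Each msg As String In messages
            ' Embedded LFs become RTF line breaks inside the cell.
            sb.Append(" ").Append(msg.Replace(vbLf, "\line ")).Append("\cell")
        Next
        sb.Append("\row \pard")
        Return sb.ToString()
    End Function

    Sub Main()
        Dim messages As New List(Of String) From {"Message1" & vbLf & "Trip1", "Message2"}
        Dim widths As New List(Of Integer) From {10, 10}
        Console.WriteLine(BuildRtfRow(messages, widths))
    End Sub
End Module
```

Keeping the grouping pass and the writing pass separate means the width limit can be tuned without touching the RTF generation.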

trying to to get specifically ordered data from .txt and then converting and storing it to double or string arrays at visual basic

I am trying, for several days, to take specifically ordered data from a .txt file and then convert and store it to double or string arrays
the data is stored in the file in this way:
1 0 1 0 >= 15
0 1 0 1 >= 28
1 1 0 0 <= 30
0 0 3 1 <= 22
-1 0 2 0 <= 0
(one line after the other with no blank lines between them)
and my code for this goes like:
Using stream As System.IO.FileStream = System.IO.File.OpenRead("C:\Users\user\Desktop\test_new.txt")
    Using reader As New System.IO.StreamReader(stream)
        Dim lineCount = File.ReadAllLines("C:\Users\user\Desktop\test_new.txt").Length
        Dim line As String = reader.ReadLine()
        Dim aryTextFile() As String
        Dim operator1() As String
        Dim variables(,) As Double
        Dim results() As Double
        Dim counter3 As Integer
        counter3 = 0
        NRows = lineCount
        While (line IsNot Nothing)
            Dim columns = line.Split(" ")
            aryTextFile = line.Split(" ")
            line = reader.ReadLine()
            NVars = columns.Length - 3
            For j = 0 To UBound(aryTextFile) - 3
                variables(counter3, j) = CDbl(aryTextFile(j))
            Next j
            For j = NVars To UBound(aryTextFile) - 2
                operator1(counter3) = CStr(aryTextFile(j))
            Next j
            For j = UBound(aryTextFile) - 2 To UBound(aryTextFile) - 1
                results(counter3) = CDbl(aryTextFile(j))
            Next j
            counter3 = counter3 + 1
        End While
    End Using
End Using
I'm getting warnings, which of course result in errors:
Variable 'variables' is used before it has been assigned a value. A null reference exception could result at
runtime. C:\Users\user\Documents\Visual Studio
2008\Projects\WindowsApplication1\WindowsApplication1\Form1.vb 230 25 WindowsApplication1
Variable 'operator1' is used before it has been assigned a value. A null reference exception could result at
runtime. C:\Users\user\Documents\Visual Studio
2008\Projects\WindowsApplication1\WindowsApplication1\Form1.vb 236 25 WindowsApplication1
Variable 'results' is used before it has been assigned a value. A null reference exception could result at
runtime. C:\Users\user\Documents\Visual Studio
2008\Projects\WindowsApplication1\WindowsApplication1\Form1.vb 243 25 WindowsApplication1
So what am I doing wrong, and how can I fix it?
note: data is saved from dynamic matrixes, so it can be a lot bigger than the displayed example (several lines with several columns), and that is the reason I'm trying to program it, in order to save me some lab time of copying and pasting it manually...
thanks in advance,
Viktor
p.s. if another member or admin can indicate an older post about my question, that would also be very helpful, but I am reading posts for the last 4 days in similar questions and I couldn't find something working for me
p.s.2 since is my first post, I have also tried to attach the project and I couldn't find a way :)
You need to define the dimensions of your arrays before trying to use them.
If you don't know what the size will be use a list instead.
'No defined size - Warning
Dim array1() As String
'Error when trying to access
array1(4) = "Testing"
'Defined size
Dim array2(10) As String
array2(5) = "Testing"
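The list-based alternative mentioned above could look roughly like this for the data in the question. It assumes each line has the form "coefficients, operator, result" separated by spaces, so the last token is the result, the one before it is the operator, and everything earlier is a coefficient. The `ParseLines` name and the choice of one `Double()` per row are illustrative assumptions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Globalization

Module ConstraintParserDemo
    ' Fills the three lists from lines such as "1 0 1 0 >= 15".
    Sub ParseLines(lines As IEnumerable(Of String),
                   variables As List(Of Double()),
                   operators As List(Of String),
                   results As List(Of Double))
        For Each line As String In lines
            Dim parts As String() = line.Split(New Char() {" "c},
                                               StringSplitOptions.RemoveEmptyEntries)
            ' Need at least one coefficient, an operator, and a result.
            If parts.Length < 3 Then Continue For

            Dim nVars As Integer = parts.Length - 2
            Dim coeffs(nVars - 1) As Double   ' sized per line, so no fixed dimension is needed
            For j As Integer = 0 To nVars - 1
                coeffs(j) = Double.Parse(parts(j), CultureInfo.InvariantCulture)
            Next

            variables.Add(coeffs)
            operators.Add(parts(nVars))
            results.Add(Double.Parse(parts(nVars + 1), CultureInfo.InvariantCulture))
        Next
    End Sub

    Sub Main()
        Dim variables As New List(Of Double())
        Dim operators As New List(Of String)
        Dim results As New List(Of Double)
        ParseLines(New String() {"1 0 1 0 >= 15", "-1 0 2 0 <= 0"},
                   variables, operators, results)
        For r As Integer = 0 To results.Count - 1
            Console.WriteLine("{0} coefficients, operator {1}, result {2}",
                              variables(r).Length, operators(r), results(r))
        Next
    End Sub
End Module
```

Because the lists grow as lines are read, the file does not have to be read twice to count rows first. For a real file, `IO.File.ReadLines(path)` could be passed as the `lines` argument.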

How to create a csv file from a formatted text file

I have a text file with hundreds of prospects that I'm trying to convert to csv for importing.
Format for whole document.
Prospect Name
Description
Website
Prospect Name
Description
Website
How would I write a VB program that loops through this to make a CSV file? Every 4 lines is a new prospect.
Get your file contents as an IEnumerable(of String), and spin through it, adding a CSV row for every record (into a new list(of String)). Finally write the file contents. You'll probably want to add a header row.
Dim lstLines As IEnumerable(Of String) = IO.File.ReadLines("C:\test\ConvertToCSV.txt")
Dim lstNewLines As New List(Of String)
Dim intRecordTracker As Integer = 0
Dim strCSVRow As String = String.Empty
For Each strLine As String In lstLines
    If intRecordTracker = 4 Then
        intRecordTracker = 0
        'Trim off extra comma.
        lstNewLines.Add(strCSVRow.Substring(0, (strCSVRow.Length - 1)))
        strCSVRow = String.Empty
    End If
    strCSVRow += strLine & ","
    intRecordTracker += 1
Next
'Add the last record.
lstNewLines.Add(strCSVRow.Substring(0, (strCSVRow.Length - 1)))
'Finally write the CSV file.
IO.File.WriteAllLines("C:\Test\ConvertedCSV.csv", lstNewLines)
Dim count As Integer = -1, lstOutput As New List(Of String)
lstOutput.AddRange(From b In File.ReadAllLines("C:\temp\intput.txt").ToList.GroupBy(Function(x) (Math.Max(Threading.Interlocked.Increment(count), count - 1) \ 4)).ToList() Select String.Join(",", b))
File.WriteAllLines("c:\temp\test\output.txt", lstOutput)
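Building on the first answer's suggestion to add a header row, here is a hedged variant that also quotes fields containing commas or quotes, since a description may well contain a comma. The `ToCsv` and `Quote` names are hypothetical, and the header text, including the placeholder fourth column name, is an assumption because the question does not say what the fourth line of each record holds.

```vbnet
Imports System
Imports System.Collections.Generic

Module ProspectCsvDemo
    ' Quotes a field when it contains a comma, a double quote, or a line break.
    Function Quote(field As String) As String
        If field.IndexOfAny(New Char() {","c, """"c, ControlChars.Cr, ControlChars.Lf}) >= 0 Then
            Return """" & field.Replace("""", """""") & """"
        End If
        Return field
    End Function

    ' Groups every linesPerRecord input lines into one CSV row, after a header row.
    Function ToCsv(lines As IList(Of String), linesPerRecord As Integer,
                   header As String) As List(Of String)
        Dim output As New List(Of String)
        output.Add(header)
        Dim fields As New List(Of String)
        For i As Integer = 0 To lines.Count - 1
            fields.Add(Quote(lines(i).Trim()))
            ' Flush when the record is complete or the input ends.
            If fields.Count = linesPerRecord OrElse i = lines.Count - 1 Then
                output.Add(String.Join(",", fields.ToArray()))
                fields.Clear()
            End If
        Next
        Return output
    End Function

    Sub Main()
        Dim lines As New List(Of String) From {
            "Acme, Inc.", "Makes anvils", "acme.example", "",
            "Globex", "Energy", "globex.example", ""
        }
        ' "Field4" is a placeholder name for the unknown fourth line.
        For Each row As String In ToCsv(lines, 4, "Prospect Name,Description,Website,Field4")
            Console.WriteLine(row)
        Next
    End Sub
End Module
```

For real input, `IO.File.ReadAllLines(path)` can supply the lines and `IO.File.WriteAllLines(path, rows)` can write the result, as in the answers above.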