Malformed CSV at end - vb.net

Hey all, I am trying to figure out a way to correct an error in my CSV file before the parser fails with a MalformedLineException.
My code is this:
Using myreader As New Microsoft.VisualBasic.FileIO.TextFieldParser("c:\temp.csv")
myreader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
myreader.Delimiters = New String() {","} ' "\n" is not an escape sequence in VB; the parser splits lines on its own
myreader.HasFieldsEnclosedInQuotes = True 'Added
While Not myreader.EndOfData
Try
currentrow = myreader.ReadFields()
The error is on the currentrow = myreader.ReadFields(). It's caused by not having the end quote in the last line of the CSV:
"xx.xxx.xxx.xx","2012-05-15 13:15:54","Bob Barker","bbarker#priceisright.com","
It should read:
"xx.xxx.xxx.xx","2012-05-15 13:15:54","Bob Barker","bbarker#priceisright.com",""
How can I correct this before it gets to the line currentrow = myreader.ReadFields()?

You can use File.AppendAllText to add the quote:
File.AppendAllText(filePath, """")
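Appending unconditionally would break a file that is already well formed, so it may be worth checking first. Here is a minimal sketch, assuming the only damage is a single missing quote at the very end of the file. A valid CSV contains an even number of double quotes (escaped quotes come in pairs), so an odd count means one is missing. HasUnbalancedQuotes and CloseTrailingQuote are hypothetical helper names, not part of any library.

```vbnet
Imports System.IO
Imports System.Linq

Module CsvQuoteRepair
    ' Returns True when the text contains an odd number of double quotes,
    ' which means some quoted field was never closed.
    Function HasUnbalancedQuotes(text As String) As Boolean
        Return text.Count(Function(c) c = """"c) Mod 2 = 1
    End Function

    ' Appends a closing quote only when the file actually needs one.
    ' Assumption: the missing quote belongs at the end of the file.
    Sub CloseTrailingQuote(filePath As String)
        If HasUnbalancedQuotes(File.ReadAllText(filePath)) Then
            File.AppendAllText(filePath, """")
        End If
    End Sub
End Module
```

Call CloseTrailingQuote("c:\temp.csv") before creating the TextFieldParser. Note that if the file ends with a line break after the stray quote, the appended quote closes a field that contains that line break.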

Related

FileIO.TextFieldParser get unaltered row for reporting on failed parse

I want to output the current row if there is an error but I'm getting a message that the current record is nothing.
Here is my code:
Dim currentRow As String()
Using MyReader As New FileIO.TextFieldParser(filenametoimport)
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
ImportLine(currentRow)
Catch ex As FileIO.MalformedLineException
report.AppendLine()
report.AppendLine($"[{currentrow}]")
report.AppendLine("- record is malformed and will be skipped. ")
Continue While
End Try
End While
End Using
I need to output the currentrow so that I can report to the user that there was a bad record.
report.AppendLine($"[{currentrow}]")
I understand that the value would be null if the parse failed but is there a way to get the current record?
How do I output this record if it failed to parse the record?
Thanks for the assistance!
You can't get the raw data directly in the exception, but you can at least get the line number where the error occurred. You may be able to use that line number to go back and find the offending record:
Dim currentRow As String()
Using MyReader As New FileIO.TextFieldParser(filenametoimport)
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
ImportLine(currentRow)
Catch ex As FileIO.MalformedLineException
report.AppendLine($"{vbCrLf}- record at line {ex.LineNumber} is malformed and will be skipped. ")
End Try
End While
End Using
TextFieldParser also provides access to the underlying stream, and provides a ReadLine() method, so if you're really desperate to write the code you could walk the stream back to the previous line ending and then call MyReader.ReadLine() to get the record (which would in turn advance the stream again to where you expect).
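It may also be worth looking at the parser itself rather than the exception: TextFieldParser has ErrorLine and ErrorLineNumber properties that describe the line behind the most recent MalformedLineException. A rough sketch along those lines follows. ReportMalformedLines and csvPath are hypothetical names, and the ImportLine call from the question is left as a comment.

```vbnet
Imports System.Text
Imports Microsoft.VisualBasic.FileIO

Module MalformedRowReport
    ' Parses the file at csvPath and returns a report listing every line
    ' that could not be parsed, including its raw text.
    Function ReportMalformedLines(csvPath As String) As String
        Dim report As New StringBuilder()
        Using parser As New TextFieldParser(csvPath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            While Not parser.EndOfData
                Try
                    Dim fields As String() = parser.ReadFields()
                    ' ImportLine(fields) would go here, as in the question.
                Catch ex As MalformedLineException
                    ' ErrorLine holds the text of the line that just failed.
                    report.AppendLine($"[{parser.ErrorLine}]")
                    report.AppendLine($"- record at line {parser.ErrorLineNumber} is malformed and will be skipped.")
                End Try
            End While
        End Using
        Return report.ToString()
    End Function
End Module
```

This avoids walking the stream backwards, since the parser keeps the failing text until the next error replaces it.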
I did not get a compile error on MyReader.SetDelimiters(",") but I changed it to an array anyway. The report.AppendLine($"[{currentrow}]") line probably doesn't expect an array. That line I altered to provide a string.
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim currentRow As String() = Nothing
Using MyReader As New FileIO.TextFieldParser("filenametoimport")
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters({","})
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
ImportLine(currentRow)
Catch ex As FileIO.MalformedLineException
report.AppendLine()
report.AppendLine($"[{String.Join(",", currentRow)}]")
report.AppendLine("- record is malformed and will be skipped. ")
Continue While
End Try
End While
End Using
End Sub
EDIT
As per comments by @Joel Coehoorn and @ErocM, if the row is null you could provide the content of the previous row so the errant row can be located.
Dim LastGoodRow As String() = {} ' empty until a row parses, so String.Join never receives Nothing
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
ImportLine(currentRow)
LastGoodRow = currentRow
Catch ex As FileIO.MalformedLineException
report.AppendLine()
report.AppendLine($"[{String.Join(",", LastGoodRow)}]")
report.AppendLine("- record following this row is malformed and will be skipped. ")
Continue While
End Try
End While

How to eliminate some rows and columns from a CSV file and save to new CSV?

I have a csv file, where the first 3 rows have unwanted data. The 4th row has needed data in the first column only. There are 4 more rows with unwanted data. Rows 9 through the end have needed data. Starting with row 9 there are 11 columns of data, columns 1 through 6 are needed, columns 7 through 11 are unwanted.
I have code that uses a DataGridView for temporary storage. It provides the parsing described above, however I don't need to view the data, I need to create a new CSV file resulting from the parsing.
There is probably a method using a data table for temporary storage, instead of the DataGridView, however maybe there is a simpler way using LINQ. I have no experience with LINQ and my experience with data tables is very limited. I am very comfortable with DataGridView since I use it extensively, but as I wrote earlier I don't need to display the result.
I tried the code in: https://www.codeproject.com/questions/634373/how-to-delete-the-rows-in-csv-file. But it doesn't fit my situation. The code below works using a DataGridView for temporary storage but I am sure there is a better way.
Using MyReader As New TextFieldParser(racerFile)
Dim currentRow As String()
MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
MyReader.Delimiters = New String() {","}
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
boatClass = MyReader.ReadFields()(0)
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
While Not MyReader.EndOfData
Try
Dgvs.Rows.Add()
currentRow = MyReader.ReadFields()
Dgvs(0, rd).Value = boatClass
Dgvs(1, rd).Value = currentRow(1)
Dgvs(2, rd).Value = currentRow(2)
Dgvs(3, rd).Value = currentRow(3)
Dgvs(4, rd).Value = currentRow(4)
Dgvs(5, rd).Value = currentRow(5)
rd += 1
Catch ex As Exception
End Try
End While
End Using
Using WriteFile As New StreamWriter(myFile)
For x As Integer = 0 To Dgvs.Rows.Count - 1
For y As Integer = 0 To Dgvs.Columns.Count - 1
WriteFile.Write(Dgvs.Rows(x).Cells(y).Value)
If y <> Dgvs.Columns.Count - 1 Then
WriteFile.Write(", ")
End If
Next
WriteFile.WriteLine()
Next
End Using
I need a CSV file for output.
Instead of storing values in a DataGridView, you could store them in a List(Of String), where each string in the list is a line of the output csv file.
Dim output As New List(Of String)
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
Dim line As String
line = boatClass
line = line & "," & currentRow(1).ToString
line = line & "," & currentRow(2).ToString
line = line & "," & currentRow(3).ToString
line = line & "," & currentRow(4).ToString
line = line & "," & currentRow(5).ToString
output.Add(line)
Catch ex As Exception
End Try
End While
And then you write output lines as follows.
Using WriteFile As New StreamWriter(myFile)
For Each line As String In output
WriteFile.WriteLine(line) ' WriteLine, not Write, so each record ends up on its own line
Next
End Using
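Since the question mentions LINQ, the same idea can be sketched a little more compactly. This is only an illustration under the row layout described in the question (rows 1 to 3 skipped, boat class in row 4, rows 5 to 8 skipped, data from row 9). BuildLine and TrimCsv are hypothetical names, and the sketch assumes no kept field contains a comma, because it does not re-quote values.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq
Imports Microsoft.VisualBasic.FileIO

Module RacerCsvTrim
    ' Builds one output line: the boat class followed by fields 1 to 5 of the row.
    Function BuildLine(boatClass As String, row As String()) As String
        Return boatClass & "," & String.Join(",", row.Skip(1).Take(5))
    End Function

    ' Reads inputPath, drops the unwanted rows and columns, writes outputPath.
    Sub TrimCsv(inputPath As String, outputPath As String)
        Dim output As New List(Of String)
        Using reader As New TextFieldParser(inputPath)
            reader.TextFieldType = FieldType.Delimited
            reader.SetDelimiters(",")
            For i As Integer = 1 To 3            ' rows 1 to 3: unwanted
                reader.ReadFields()
            Next
            Dim boatClass As String = reader.ReadFields()(0)   ' row 4, first column
            For i As Integer = 1 To 4            ' rows 5 to 8: unwanted
                reader.ReadFields()
            Next
            While Not reader.EndOfData           ' row 9 onward: data
                output.Add(BuildLine(boatClass, reader.ReadFields()))
            End While
        End Using
        File.WriteAllLines(outputPath, output)   ' one list entry per line
    End Sub
End Module
```

File.WriteAllLines replaces the explicit StreamWriter loop, and TrimCsv(racerFile, myFile) would be the call with the variables from the question.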

Working with commas in a comma delimited file

I have a vb project that imports a csv file and some of the data contains commas. The fields with the commas are in double quotes.
I am creating a datagridview from the header row of the csv then importing the remainder of the file into the dgv but the fields with commas are causing a problem. The fields are not fixed width.
I think I need a way to qualify the commas as a delimiter based on double quote or some other method of importing the data into the dgv.
Thanks
Using objReader As New StreamReader(FName)
Dim line As String = objReader.ReadLine()
Do While objReader.Peek() <> -1
line = objReader.ReadLine()
Dim splitLine() As String = line.Split(",")
DataGridView1.Rows.Add(splitLine)
Application.DoEvents()
Loop
End Using
Example Data:
1,"VALIDFLAG, NOGPS",0,1.34,3.40,0.17,1
Thanks very much for the suggestions.
I am going to use textfieldparser for my import.
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(FName)
MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
MyReader.Delimiters = New String() {","}
Dim currentRow As String()
Dim firstline As Boolean = True
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
If firstline = True Then
firstline = False
Else
Me.DataGridView1.Rows.Add(currentRow)
End If
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is invalid. Skipping")
End Try
Application.DoEvents()
End While
End Using
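To see the quote handling in isolation, here is a small sketch that runs the example line from above through TextFieldParser using a StringReader instead of a file. ParseCsvLine is a hypothetical helper name used only for this illustration.

```vbnet
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module QuotedFieldDemo
    ' Splits a single CSV line, treating commas inside double quotes as data.
    Function ParseCsvLine(line As String) As String()
        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            Return parser.ReadFields()
        End Using
    End Function
End Module
```

ParseCsvLine("1,""VALIDFLAG, NOGPS"",0,1.34,3.40,0.17,1") should return seven fields, with the second one being VALIDFLAG, NOGPS without the surrounding quotes, whereas line.Split(",") would cut that value in two.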

I used FSO to get a file name. But I need the path too. Is it possible to get the path using FSO?

Here is the code I'm using. Hope that it helps pinpoint the area where I need help.
Private Sub readCVS_file()
Using MyReader As New Microsoft.VisualBasic.
FileIO.TextFieldParser(
GetTempPath() + "output.tmp")
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
Dim currentField As String
For Each currentField In currentRow
'get file path using currentField. First field is program name
Next
Catch ex As Microsoft.VisualBasic.
FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
" is not valid.")
End Try
End While
End Using
End Sub
After reading the current field, I want to get the path to the filename (filename is the first field in output.tmp text file.)
Thanks for any help...
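One possible direction, sketched under a loud assumption: the first field holds a file name or relative path that can be resolved against the current working directory. If the file lives somewhere else, this will not find it. GetContainingFolder is a hypothetical helper name. It relies only on System.IO.Path rather than FSO.

```vbnet
Imports System.IO

Module FilePathLookup
    ' Returns the folder that contains fileName.
    ' Assumption: a relative name is resolved against the current directory.
    Function GetContainingFolder(fileName As String) As String
        Return Path.GetDirectoryName(Path.GetFullPath(fileName))
    End Function
End Module
```

Inside the loop this would be called as GetContainingFolder(currentRow(0)), since the first field is the program name.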

VB.NET read only some cells for each row from csv file

I have a csv file that has the following structure :
id,name,adress,email,age
1,john,1 str xxxx,john#gmail.com,19
2,mike,2 bd xxxx,mike#gmail.com,21
3,jeana,1 str ssss,jeana#gmail.com,18
.......................
.......................
What I would like to do is to read the csv file, skip the first line (contains headers) and extract the 2nd, 3rd and 4th data from each row and populate a datagridview.
This is the code I'm using however it brings me all the csv content :
DataGridView1.ColumnCount = 4
DataGridView1.Columns(0).Name = "ID"
DataGridView1.Columns(1).Name = "NAME"
DataGridView1.Columns(2).Name = "ADRESS"
DataGridView1.Columns(3).Name = "AGE"
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser _
(openFile.FileName) ' the csv path
'Specify that reading from a comma-delimited file'
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
With DataGridView1.Rows.Add(currentRow) 'Add new row to data gridview'
End With
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
End Try
End While
End Using
So can someone show me how to do that?
Thanks.
It could be as simple as reading the first line and discarding it, then starting to read the real data from your file.
Using MyReader As New TextFieldParser(openFile.FileName)
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
' Read the first line and do nothing with it
If Not MyReader.EndOfData Then
currentRow = MyReader.ReadFields()
End If
While Not MyReader.EndOfData
Try
' Read the next data row
currentRow = MyReader.ReadFields()
DataGridView1.Rows.Add(currentRow(1), currentRow(2), currentRow(3))
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
End Try
End While
End Using
EDIT: Following your comment below, I have changed the line that adds the row so that it adds only the strings at positions 1, 2 and 3. This of course differs from the columns defined on the DataGridView. It is not clear whether you want to change those columns to contain only these 3 fields. If you still want the ID and AGE columns in the grid, you could change the Add call to
DataGridView1.Rows.Add("", currentRow(1), currentRow(2),currentRow(3), "")
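If the set of wanted columns might change later, the selection can be pulled into a small helper. This is just a sketch, and PickFields is a hypothetical name, not something TextFieldParser or DataGridView provides.

```vbnet
Imports System.Linq

Module FieldPicker
    ' Returns only the fields at the given zero-based positions, in that order.
    Function PickFields(row As String(), ParamArray indexes As Integer()) As String()
        Return indexes.Select(Function(i) row(i)).ToArray()
    End Function
End Module
```

With that in place, the line inside the loop could read DataGridView1.Rows.Add(PickFields(currentRow, 1, 2, 3)). A row with fewer fields than expected would raise an IndexOutOfRangeException, so short rows still need their own handling.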