How to eliminate some rows and columns from a CSV file and save to new CSV? - vb.net

I have a csv file, where the first 3 rows have unwanted data. The 4th row has needed data in the first column only. There are 4 more rows with unwanted data. Rows 9 through the end have needed data. Starting with row 9 there are 11 columns of data, columns 1 through 6 are needed, columns 7 through 11 are unwanted.
I have code that uses a DataGridView for temporary storage. It provides the parsing described above, however I don't need to view the data, I need to create a new CSV file resulting from the parsing.
There is probably a method using a data table for temporary storage, instead of the DataGridView, however maybe there is a simpler way using LINQ. I have no experience with LINQ and my experience with data tables is very limited. I am very comfortable with DataGridView since I use it extensively, but as I wrote earlier I don't need to display the result.
I tried the code in: https://www.codeproject.com/questions/634373/how-to-delete-the-rows-in-csv-file. But it doesn't fit my situation. The code below works using a DataGridView for temporary storage but I am sure there is a better way.
Using MyReader As New TextFieldParser(racerFile)
Dim currentRow As String()
MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
MyReader.Delimiters = New String() {","}
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
boatClass = MyReader.ReadFields()(0)
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
currentRow = MyReader.ReadFields()
While Not MyReader.EndOfData
Try
Dgvs.Rows.Add()
currentRow = MyReader.ReadFields()
Dgvs(0, rd).Value = boatClass
Dgvs(1, rd).Value = currentRow(1)
Dgvs(2, rd).Value = currentRow(2)
Dgvs(3, rd).Value = currentRow(3)
Dgvs(4, rd).Value = currentRow(4)
Dgvs(5, rd).Value = currentRow(5)
rd += 1
Catch ex As Exception
End Try
End While
End Using
Using WriteFile As New StreamWriter(myFile)
For x As Integer = 0 To Dgvs.Rows.Count - 1
For y As Integer = 0 To Dgvs.Columns.Count - 1
WriteFile.Write(Dgvs.Rows(x).Cells(y).Value)
If y <> Dgvs.Columns.Count - 1 Then
WriteFile.Write(", ")
End If
Next
WriteFile.WriteLine()
Next
End Using
I need a CSV file for output.

Instead of storing values in a DataGridView, you could store them in a List(Of String), where each string in the list is one line of the output CSV file.
Dim output As New List(Of String)
While Not MyReader.EndOfData
    Try
        currentRow = MyReader.ReadFields()
        ' Start each output line with boatClass, then append fields 1 through 5.
        Dim line As String = boatClass
        For i As Integer = 1 To 5
            line = line & "," & currentRow(i)
        Next
        output.Add(line)
    Catch ex As Exception
        ' Rows that cannot be parsed, or that have too few fields, are skipped.
    End Try
End While
Then write the output lines as follows. Use WriteLine rather than Write so that each entry ends up on its own line of the file.
Using WriteFile As New StreamWriter(myFile)
    For Each line As String In output
        WriteFile.WriteLine(line)
    Next
End Using
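Putting the two pieces together, the whole job can be done without any grid at all. The following is only a sketch under the layout described in the question (3 unwanted rows, the class name in row 4, 4 more unwanted rows, then data rows). The names TrimCsvSketch, TrimCsv, inputPath and outputPath are made up for illustration. It copies fields 1 through 5, as the original code does, and skips rows with fewer than 6 fields.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module TrimCsvSketch
    ' Reads inputPath, drops the unwanted rows and columns, and writes outputPath.
    Sub TrimCsv(inputPath As String, outputPath As String)
        Dim output As New List(Of String)
        Using parser As New TextFieldParser(inputPath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Rows 1 to 3 are unwanted.
            For i As Integer = 1 To 3
                parser.ReadFields()
            Next
            ' Row 4: only the first column is needed.
            Dim boatClass As String = parser.ReadFields()(0)
            ' Rows 5 to 8 are unwanted.
            For i As Integer = 1 To 4
                parser.ReadFields()
            Next
            ' Row 9 onward: keep boatClass plus fields 1 through 5.
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                If fields.Length >= 6 Then
                    output.Add(boatClass & "," & String.Join(",", fields, 1, 5))
                End If
            End While
        End Using
        File.WriteAllLines(outputPath, output)
    End Sub
End Module
```

Note that this sketch does not re-quote fields, so a value that itself contains a comma would need to be wrapped in double quotes before writing.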

Related

Skip the first line of the CSV file (Headers) Visual Basic

Like many on here, I am new to programming and mainly focus on web development. I have written a program, cobbled together from help on here, that works perfectly. It takes a CSV file and inserts it into an SQL database. I am getting a MalformedLineException on the last line of the CSV file and believe it is because the header line is not being skipped.
Would love some help on working out how to skip the first line from my code below:
Private Sub subProcessFile(ByVal strFileName As String)
'This is the file location for the CSV File
Using TextFileReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(strFileName)
'removing the delimiter
TextFileReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
TextFileReader.SetDelimiters(",")
ProgressBar1.Value = 0
Application.DoEvents()
'variables
Dim TextFileTable As DataTable = Nothing
Dim Column As DataColumn
Dim Row As DataRow
Dim UpperBound As Int32
Dim ColumnCount As Int32
Dim CurrentRow As String()
'Loop To read in data from CSV
While Not TextFileReader.EndOfData
Try
CurrentRow = TextFileReader.ReadFields()
If Not CurrentRow Is Nothing Then
''# Check if DataTable has been created
If TextFileTable Is Nothing Then
TextFileTable = New DataTable("TextFileTable")
''# Get number of columns
UpperBound = CurrentRow.GetUpperBound(0)
''# Create new DataTable
For ColumnCount = 0 To UpperBound
Column = New DataColumn()
Column.DataType = System.Type.GetType("System.String")
Column.ColumnName = "Column" & ColumnCount
Column.Caption = "Column" & ColumnCount
Column.ReadOnly = True
Column.Unique = False
TextFileTable.Columns.Add(Column)
ProgressBar1.Value = 25
Application.DoEvents()
Next
clsDeletePipeLineData.main()
End If
Row = TextFileTable.NewRow
'Dim Rownum As Double = Row
'If Rownum >= 1715 Then
' MsgBox(Row)
'End If
For ColumnCount = 0 To UpperBound
Row("Column" & ColumnCount) = CurrentRow(ColumnCount).ToString
Next
TextFileTable.Rows.Add(Row)
clsInsertPipeLineData.main(CurrentRow(0).ToString, CurrentRow(1).ToString, CurrentRow(2).ToString, CurrentRow(3).ToString, CurrentRow(4).ToString, CurrentRow(5).ToString, CurrentRow(6).ToString, CurrentRow(7).ToString, CurrentRow(9).ToString)
ProgressBar1.Value = 50
Application.DoEvents()
End If
Catch ex As _
Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
"is not valid and will be skipped.")
End Try
End While
ProgressBar1.Value = 100
Application.DoEvents()
clsMailConfirmation.main()
TextFileReader.Dispose()
MessageBox.Show("The process has been completed successfully")
End Using
"MalformedLineException" says that Line cannot be parsed using the current Delimiters, to fix it, adjust Delimiters so the line can be parsed correctly, or insert exception-handling code in order to handle the line.
Someone else encountered a similar question; maybe its reply can help you.
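As for skipping the header itself, one possible approach is to consume the first line before the read loop starts. This is a minimal sketch, not a drop-in replacement for subProcessFile: the names SkipHeaderSketch and ReadDataRows are hypothetical, and it only collects the rows instead of inserting them into a database. Whether the header is really what triggers the exception on the last line is not confirmed here.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module SkipHeaderSketch
    ' Reads a comma-delimited file and returns every row except the first.
    Function ReadDataRows(path As String) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(path)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Discard the header line before entering the loop.
            If Not parser.EndOfData Then parser.ReadLine()
            While Not parser.EndOfData
                Try
                    rows.Add(parser.ReadFields())
                Catch ex As MalformedLineException
                    ' LineNumber identifies the line that could not be parsed.
                    Console.WriteLine("Line " & ex.LineNumber & " is not valid and will be skipped.")
                End Try
            End While
        End Using
        Return rows
    End Function
End Module
```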

Split in VB.net

FASTER,WW0011,"CTR ,REURN,ALT TUBING HELIUM LEAK",DEFAULT test,1,3.81,test
I need to get the result of the following line as
Arr(0) =faster
Arr(1) =WW0011
Arr(2) =CTR ,REURN,ALT TUBING HELIUM LEAK
Arr(3) =DEFAULT test
Arr(4) =faster
Arr(5) = 1
Arr(6)=3.81
Arr(7) = test
I tried using split, but the problem is on Arr(2)
could anyone please give me a solution
You could use the TextFieldParser class, which takes care of situations like this. Set the HasFieldsEnclosedInQuotes property to True. Here is an example from MSDN (slightly altered):
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("c:\logs\bigfile")
MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
MyReader.Delimiters = New String() {","}
'Set this to ignore commas in quoted fields.
MyReader.HasFieldsEnclosedInQuotes = True
Dim currentRow As String()
'Loop through all of the fields in the file.
'If any lines are corrupt, report an error and continue parsing.
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
' Include code here to handle the row.
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & " is invalid. Skipping")
End Try
End While
End Using
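If the line is already in memory as a string, the same parser can be pointed at a StringReader instead of a file. The sketch below assumes that situation; QuotedLineSketch and ParseLine are illustrative names only.

```vb
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module QuotedLineSketch
    ' Parses a single comma-delimited line, keeping commas inside quoted fields.
    Function ParseLine(line As String) As String()
        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            Return parser.ReadFields()
        End Using
    End Function
End Module
```

For the sample line in the question, the third element of the returned array is the whole quoted field, commas included and without the surrounding quotes.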
I use this function a lot myself:
Private Function splitQuoted(ByVal line As String, ByVal delimeter As Char) As String()
Dim list As New List(Of String)
Do While line.IndexOf(delimeter) <> -1
If line.StartsWith("""") Then
line = line.Substring(1)
Dim idx As Integer = line.IndexOf("""")
While line.IndexOf("""", idx) = line.IndexOf("""""", idx)
idx = line.IndexOf("""""", idx) + 2
End While
idx = line.IndexOf("""", idx)
list.Add(line.Substring(0, idx))
line = line.Substring(idx + 2)
Else
list.Add(line.Substring(0, Math.Max(line.IndexOf(delimeter), 0)))
line = line.Substring(line.IndexOf(delimeter) + 1)
End If
Loop
list.Add(line)
Return list.ToArray
End Function
Use a for loop to iterate the string char by char!
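A character-by-character splitter along those lines might look like the sketch below. It is one possible reading of the suggestion, not the answerer's code: CharSplitSketch and SplitCsvLine are invented names, and the rule that a doubled quote inside a quoted field stands for a literal quote is an assumption based on common CSV practice.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module CharSplitSketch
    ' Splits one line on the delimiter, treating text inside double quotes as one field.
    Function SplitCsvLine(line As String, delimiter As Char) As String()
        Dim fields As New List(Of String)
        Dim current As New StringBuilder()
        Dim inQuotes As Boolean = False
        Dim i As Integer = 0
        While i < line.Length
            Dim c As Char = line(i)
            If c = """"c Then
                If inQuotes AndAlso i + 1 < line.Length AndAlso line(i + 1) = """"c Then
                    ' A doubled quote inside a quoted field is a literal quote.
                    current.Append(c)
                    i += 1
                Else
                    ' Any other quote opens or closes a quoted field.
                    inQuotes = Not inQuotes
                End If
            ElseIf c = delimiter AndAlso Not inQuotes Then
                ' A delimiter outside quotes ends the current field.
                fields.Add(current.ToString())
                current.Clear()
            Else
                current.Append(c)
            End If
            i += 1
        End While
        ' Whatever remains is the last field.
        fields.Add(current.ToString())
        Return fields.ToArray()
    End Function
End Module
```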

VB.NET read only some cells for each row from csv file

I have a csv file that has the following structure :
id,name,adress,email,age
1,john,1 str xxxx,john#gmail.com,19
2,mike,2 bd xxxx,mike#gmail.com,21
3,jeana,1 str ssss,jeana#gmail.com,18
.......................
.......................
What I would like to do is to read the csv file, skip the first line (contains headers) and extract the 2nd, 3rd and 4th data from each row and populate a datagridview.
This is the code I'm using however it brings me all the csv content :
DataGridView1.ColumnCount = 4
DataGridView1.Columns(0).Name = "ID"
DataGridView1.Columns(1).Name = "NAME"
DataGridView1.Columns(2).Name = "ADRESS"
DataGridView1.Columns(3).Name = "AGE"
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser _
(openFile.FileName) 'the csv path
'Specify that reading from a comma-delimited file'
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
With DataGridView1.Rows.Add(currentRow) 'Add new row to data gridview'
End With
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & "is not valid and will be skipped.")
End Try
End While
End Using
So can someone show me how to do that?
Thanks.
It could be as simple as reading the first line and discarding it, then starting to read the real data from your file:
Using MyReader As New TextFieldParser(openFile.FileName)
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
' Read the first line and do nothing with it
If Not MyReader.EndOfData Then
currentRow = MyReader.ReadFields()
End If
While Not MyReader.EndOfData
Try
' Read again the file
currentRow = MyReader.ReadFields()
DataGridView1.Rows.Add(currentRow(1), currentRow(2),currentRow(3))
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & "is not valid and will be skipped.")
End Try
End While
End Using
EDIT: Seeing your comment below, I have changed the line that adds the row so that it adds only the strings at positions 1, 2 and 3. This, of course, differs from the columns added to the DataGridView. It is not clear whether you want to change these columns to contain only these 3 fields. If you still want the columns for ID and AGE in the grid, you could change the Add to
DataGridView1.Rows.Add("", currentRow(1), currentRow(2),currentRow(3), "")

Malformed CSV at end

Hey all, I am trying to figure out a way of correcting the error in my CSV file before it errors out with a MalformedLineException.
My code is this:
Using myreader As New Microsoft.VisualBasic.FileIO.TextFieldParser("c:\temp.csv")
myreader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
myreader.Delimiters = New String() {",", "\n"}
myreader.HasFieldsEnclosedInQuotes = True 'Added
While Not myreader.EndOfData
Try
currentrow = myreader.ReadFields()
The error is on the currentrow = myreader.ReadFields(). It's caused by not having the end quote in the last line of the CSV:
"xx.xxx.xxx.xx","2012-05-15 13:15:54","Bob Barker","bbarker#priceisright.com","
It should read:
"xx.xxx.xxx.xx","2012-05-15 13:15:54","Bob Barker","bbarker#priceisright.com",""
How can I correct this before it gets to the line currentrow = myreader.ReadFields()?
You can use File.AppendAllText to add the quote:
File.AppendAllText(filePath, """")
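Appending unconditionally would break a file that is already well formed, so it may be worth checking first. The sketch below uses a simple heuristic that is an assumption, not something stated in the answer: an odd number of double quotes in the file suggests that the last quoted field was never closed. QuoteRepairSketch, HasUnbalancedQuotes and CloseDanglingQuote are made-up names.

```vb
Imports System
Imports System.IO
Imports System.Linq

Module QuoteRepairSketch
    ' Returns True when the text contains an odd number of double quotes.
    Function HasUnbalancedQuotes(text As String) As Boolean
        Return text.Count(Function(ch) ch = """"c) Mod 2 = 1
    End Function

    ' Appends a closing quote only when the quotes in the file are unbalanced.
    Sub CloseDanglingQuote(filePath As String)
        If HasUnbalancedQuotes(File.ReadAllText(filePath)) Then
            File.AppendAllText(filePath, """")
        End If
    End Sub
End Module
```

Call CloseDanglingQuote on the file before opening it with the TextFieldParser. Since the whole file is read into memory for the check, this is only reasonable for files of moderate size.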

Read only particular fields from CSV File in vb.net

I have this code to read a CSV file. It reads each line, divides each line by the delimiter ',' and stores the field values in the array 'strline()'.
How do I extract only required fields from the CSV file?
For example if I have a CSV File like
Type,Group,No,Sequence No,Row No,Date (newline)
0,Admin,3,345678,1,26052010 (newline)
1,Staff,5,78654,3,26052010
I need only the values of the Group, Sequence No and Date columns.
Thanks in advance for any ideas.
Dim myStream As StreamReader = Nothing
' Hold the Parsed Data
Dim strlines() As String
Dim strline() As String
Try
myStream = File.OpenText(OpenFile.FileName)
If (myStream IsNot Nothing) Then
' Hold the amount of lines already read in a 'counter-variable'
Dim placeholder As Integer = 0
strlines = myStream.ReadToEnd().Split(Environment.NewLine)
Do While strlines.Length <> -1 ' Is -1 when no data exists on the next line of the CSV file
strline = strlines(placeholder).Split(",")
placeholder += 1
Loop
End If
Catch ex As Exception
LogErrorException(ex)
Finally
If (myStream IsNot Nothing) Then
myStream.Close()
End If
End Try
1) DO NOT USE String.Split!!
CSV data can contain commas inside a field, e.g.
id,name
1,foo
2,"hans, bar"
Also as above you would need to handle quoted fields etc... See CSV Info for more details.
2) Check out TextFieldParser - it handles all this sort of thing.
It will handle the myriad of different escapes that you can't handle with String.Split...
Sample from: http://msdn.microsoft.com/en-us/library/cakac7e6.aspx
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\TestFolder\test.txt")
MyReader.TextFieldType = FileIO.FieldType.Delimited
MyReader.SetDelimiters(",")
Dim currentRow As String()
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
Dim currentField As String
For Each currentField In currentRow
MsgBox(currentField)
Next
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message & "is not valid and will be skipped.")
End Try
End While
End Using
The MyReader.ReadFields() call gives you an array of strings; from there, you pick out the fields you need by index.
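For the sample file in the question, Group, Sequence No and Date sit at indexes 1, 3 and 5. A rough sketch of that selection follows. It assumes the first line is a header and that every data row has at least 6 fields; SelectFieldsSketch and ReadSelectedFields are illustrative names.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module SelectFieldsSketch
    ' Returns the Group, Sequence No and Date fields (indexes 1, 3 and 5) of each data row.
    Function ReadSelectedFields(path As String) As List(Of String())
        Dim result As New List(Of String())
        Using parser As New TextFieldParser(path)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Skip the header row.
            If Not parser.EndOfData Then parser.ReadLine()
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Ignore rows that are too short to contain all three columns.
                If fields.Length >= 6 Then
                    result.Add(New String() {fields(1), fields(3), fields(5)})
                End If
            End While
        End Using
        Return result
    End Function
End Module
```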
PK :-)
Maybe instead of only importing selected fields, you should import everything, then only use the ones you need.