Parse fixed-width .csv file - vb.net

I have a .csv file that I need to parse. It's not delimited but fixed-width. Can you tell me the best way to parse this kind of file? This is the sample:
Object Name IP Address Name NE ID NE Type/Release Partition Access Profile Supervision State
MS-POLT01 10.45.3.11 MS-POLT01 1 7302 ISAM IHUB R4.3 defaultPAP Supervised
TPO-POLT02 10.34.1.33/10.74..61 TPO-POLT02 10 7302 ISAM IHUB R4.3 defaultPAP Supervised
WPU-POLT02 10.70.8.21 WPU-POLT02 100 7302 ISAM IHUB R4.3 defaultPAP Supervised
MOV-POLT01 10.70.2.45 MOV-POLT01 101 7302 ISAM IHUB R4.3 defaultPAP Supervised
Results of 'EROS': 6 records found. Duration 0 s.
This query was executed by john
EDIT - for further discussions:
Sub Main()
    Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("file.csv")
        MyReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
        MyReader.Delimiters = New String() {vbTab}
        Dim currentRow As String()
        'Loop through all of the fields in the file.
        'If any lines are corrupt, report an error and continue parsing.
        While Not MyReader.EndOfData
            Try
                currentRow = MyReader.ReadFields()
                ' Debug.Print(String.Join(",", currentRow))
                For Each currentField In currentRow
                    Debug.Print(currentField)
                Next
                ' Include code here to handle the row.
            Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                Console.WriteLine("Line " & ex.Message & " is invalid. Skipping")
            End Try
        End While
    End Using
    Console.ReadLine()
End Sub

Use the TextFieldParser class; it was developed for exactly this purpose:
MSDN Example:
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\TestFolder\test.log")
    Reader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
    ' A width of -1 means the last field runs to the end of the line.
    Reader.SetFieldWidths(5, 10, 11, -1)
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            currentRow = Reader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message &
                   " is not valid and will be skipped.")
        End Try
    End While
End Using
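As a rough sketch of how this could be adapted to a file with a header row, the helper below wraps the same parser in a function. The name `ReadFixedWidthRows`, the sample text, and the widths `{10, 15, -1}` are all made up for illustration; the real widths have to be measured from the actual file. Lines shorter than the declared widths raise `MalformedLineException` and are skipped here, but a trailer line that happens to be long enough would still be parsed as data and would need an explicit check.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module FixedWidthSketch
    ' Hypothetical helper: reads fixed-width rows, discarding the header line
    ' and any line too short to fit the declared widths.
    Function ReadFixedWidthRows(source As TextReader, widths As Integer()) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(source)
            parser.TextFieldType = FieldType.FixedWidth
            parser.SetFieldWidths(widths)
            If Not parser.EndOfData Then parser.ReadLine() ' discard the header line
            While Not parser.EndOfData
                Try
                    rows.Add(parser.ReadFields())
                Catch ex As MalformedLineException
                    ' The line does not fit the declared widths; skip it.
                End Try
            End While
        End Using
        Return rows
    End Function

    Sub Main()
        ' Made-up data and widths, for illustration only.
        Dim sample As String =
            "Name      IP Address     ID" & vbCrLf &
            "MS-POLT01 10.45.3.11     1" & vbCrLf &
            "Done." & vbCrLf
        For Each row In ReadFixedWidthRows(New StringReader(sample), {10, 15, -1})
            Console.WriteLine(String.Join("|", row))
        Next
    End Sub
End Module
```

Passing a `TextReader` instead of a path keeps the sketch testable without a file on disk; `New StreamReader(path)` could be passed in the same way.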

Related

How to Parse Text File delimited with ~ into a Datatable or Array in VB.Net

I'm trying to parse a text file into an array or DataTable, where the lines starting with D, O, and L together form a single row of data.
There is never more than one "L" line.
I want to get this into a datatable or 2-dimensional array where the column header names (locations) are
Date {D3}
Customer Name {O2}
Address {O3}
City {O7}
State {O8}
Zipcode {O9}
Reference ID {D17}
Amount {D20}
I tried
TextFieldParser("C:\Users\MyAccount\test.txt")
FileReader.SetDelimiters("~")
But I'm not understanding how to work with the output. Any ideas?
B~AAA~~12/03/19~12/03/19~1~428.51~APV~REF~K8~~
D~AAA~~12/03/19~12/03/19~APV~REF~N~REFUNDCIS~~12/03/19~0~N~N~Y~~~0000244909~~~72.90~~~00~N~0~12/03/19~0~12/03/19~12/03/19~0~K8~~~N~N~0~
O~JOHN DOE~ 1000 NOAKY LN ~~~~DETROIT~MI~31000~~~
L~01~141011~000~00000~000~00~000~~REFUND0000244909JOHN DOE~72.90~N~N~~~N~
D~AAA~~12/03/19~12/03/19~APV~REF~N~REFUNDCIS~~12/03/19~0~N~N~Y~~~0000404236~~~101.42~~~00~N~0~12/03/19~0~12/03/19~12/03/19~0~K8~~~N~N~0~
O~BRUCE DOE~UNIT 1 1000 E MICHIGAN AVE ~~~~DETROIT~MI~31000~~~
L~01~141011~000~00000~000~00~000~~REFUND0000404236BRUCE DOE~101.42~N~N~~~N~
D~AAA~~12/03/19~12/03/19~APV~REF~N~REFUNDCIS~~12/03/19~0~N~N~Y~~~0000436750~~~180.00~~~00~N~0~12/03/19~0~12/03/19~12/03/19~0~K8~~~N~N~0~
O~JOEL DOE~ 100 MICHIGAN AVE ~~~~DETROIT~MI~31000~~~
L~01~141011~000~00000~000~00~000~~REFUND0000436750JOEL DOE~180.00~N~N~~~N~
D~AAA~~12/03/19~12/03/19~APV~REF~N~REFUNDCIS~~12/03/19~0~N~N~Y~~~0000448122~~~74.19~~~00~N~0~12/03/19~0~12/03/19~12/03/19~0~K8~~~N~N~0~
O~JOHN DOE~ 100 MICHIGAN AVE ~~~~DETROIT~MI~31000~~~
L~01~141011~000~00000~000~00~000~~REFUND0000448122JOHN DOE~74.19~N~N~~~N~
First I took the code from MS docs https://learn.microsoft.com/en-us/dotnet/api/microsoft.visualbasic.fileio.textfieldparser?view=netcore-3.1
I needed to know how many columns I needed in the datatable.
Private Sub OpCode1()
    Dim maxColumnCount As Integer
    Using MyReader As New TextFieldParser("C:\Users\xxx\test.txt")
        MyReader.TextFieldType = FieldType.Delimited
        MyReader.Delimiters = {"~"}
        Dim currentRow As String()
        'Loop through all of the fields in the file.
        'If any lines are corrupt, report an error and continue parsing.
        While Not MyReader.EndOfData
            Try
                currentRow = MyReader.ReadFields()
                ' Track the widest row seen so far.
                If currentRow.Count > maxColumnCount Then
                    maxColumnCount = currentRow.Count
                End If
            Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                MsgBox("Line " & ex.Message & " is invalid. Skipping")
            End Try
        End While
    End Using
    MessageBox.Show(maxColumnCount.ToString)
End Sub
Once I had the number of columns needed, I created a DataTable and added the necessary number of columns. Then, where the Example instructed you to handle the row, I added the row to the DataTable. Lastly, I displayed the DataTable in a DataGridView.
Private Sub OPCode()
    Dim dt As New DataTable
    For i = 1 To 38
        dt.Columns.Add(i.ToString)
    Next
    Using MyReader As New TextFieldParser("C:\Users\xxx\test.txt")
        MyReader.TextFieldType = FieldType.Delimited
        MyReader.Delimiters = {"~"}
        Dim currentRow As String()
        'Loop through all of the fields in the file.
        'If any lines are corrupt, report an error and continue parsing.
        While Not MyReader.EndOfData
            Try
                currentRow = MyReader.ReadFields()
                ' Add the parsed fields as one DataTable row.
                dt.Rows.Add(currentRow)
            Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                MsgBox("Line " & ex.Message & " is invalid. Skipping")
            End Try
        End While
    End Using
    DataGridView1.DataSource = dt
End Sub
More specific to your data...
Private Sub OpCode()
    Dim dt As New DataTable
    dt.Columns.Add("Date", GetType(Date))
    dt.Columns.Add("Customer Name", GetType(String))
    dt.Columns.Add("Address", GetType(String))
    dt.Columns.Add("City", GetType(String))
    dt.Columns.Add("State", GetType(String))
    dt.Columns.Add("Zipcode", GetType(String))
    dt.Columns.Add("RefID", GetType(String))
    dt.Columns.Add("Amount", GetType(Decimal))
    Dim DataDate As Date
    Dim RefID As String = ""
    Dim Amount As Decimal
    Using MyReader As New TextFieldParser("C:\Users\maryo\test.txt")
        MyReader.TextFieldType = FieldType.Delimited
        MyReader.Delimiters = {"~"}
        Dim currentRow As String()
        'Loop through all of the fields in the file.
        'If any lines are corrupt, report an error and continue parsing.
        While Not MyReader.EndOfData
            Try
                currentRow = MyReader.ReadFields()
                ' A "D" line: remember its values for the "O" line that follows.
                If currentRow(0) = "D" Then
                    DataDate = CDate(currentRow(3))
                    RefID = currentRow(17)
                    Amount = CDec(currentRow(20))
                End If
                ' An "O" line: combine it with the saved "D" values into one row.
                If currentRow(0) = "O" Then
                    dt.Rows.Add({DataDate, currentRow(1), currentRow(2), currentRow(6), currentRow(7), currentRow(8), RefID, Amount})
                End If
            Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
                MsgBox("Line " & ex.Message & " is invalid. Skipping")
            End Try
        End While
    End Using
    DataGridView1.DataSource = dt
End Sub
I assumed that the D row applies to the following O row. I saved the data from the D row in the 3 variables and used it when the next row is read.
Remember that collections (including arrays) are zero-based in .NET.

I used FSO to get a file name. But I need the path too. Is it possible to get the path using FSO?

Here is the code I'm using. Hope that it helps pinpoint the area where I need help.
Private Sub readCVS_file()
    Using MyReader As New Microsoft.VisualBasic.
            FileIO.TextFieldParser(
            GetTempPath() + "output.tmp")
        MyReader.TextFieldType = FileIO.FieldType.Delimited
        MyReader.SetDelimiters(",")
        Dim currentRow As String()
        While Not MyReader.EndOfData
            Try
                currentRow = MyReader.ReadFields()
                Dim currentField As String
                For Each currentField In currentRow
                    'get file path using currentField. First field is program name
                Next
            Catch ex As Microsoft.VisualBasic.
                    FileIO.MalformedLineException
                MsgBox("Line " & ex.Message &
                       " is not valid.")
            End Try
        End While
    End Using
End Sub
After reading the current field, I want to get the path to the filename (filename is the first field in output.tmp text file.)
Thanks for any help...

Extract field lengths

I want to read fixed length files.
I know how to do this if I know the field lengths.
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(filePath)
    Reader.TextFieldType =
        Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
    Reader.SetFieldWidths(8, 16, 16, 12, 14, 16) 'They are different in each file
    Dim currentRow As String()
    While Not Reader.EndOfData
        Try
            currentRow = Reader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.
                FileIO.MalformedLineException
            MsgBox("Line " & ex.Message &
                   " is not valid and will be skipped.")
        End Try
    End While
End Using
The problem is that I don't know the length of each field.
Is there a way to read the first line and get the Field lengths?
Since I have found the answer, I'll post it here in case someone will face the same problem.
I solved it using regex
'sRowData is the first line of the file
Dim pArLengths() As Integer = Nothing 'Array to store the lengths
'Match each run of letters, i.e. the start of each header word
Dim regex As New Regex("[^\W_\d]+", _
                       RegexOptions.IgnoreCase _
                       Or RegexOptions.Multiline _
                       Or RegexOptions.Singleline _
                       Or RegexOptions.IgnorePatternWhitespace)
Dim myMatches As MatchCollection = regex.Matches(sRowData)
ReDim pArLengths(myMatches.Count - 1)
For i = 0 To myMatches.Count - 1
    'A field ends where the next header word starts (or at the end of the line)
    Dim k As Integer
    k = If(i < myMatches.Count - 1, myMatches(i + 1).Index, sRowData.Length)
    pArLengths(i) = k - myMatches(i).Index
Next
I hope that someone will find it useful.
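For completeness, here is a minimal sketch of how the computed lengths might be fed back into `TextFieldParser`. The function name `GetFieldWidths` and the sample content are hypothetical. It assumes, as the regex approach does, that every field starts exactly where a header word starts, that the first header begins at column 0, and that no header name contains a space (a header such as "IP Address" would be counted as two fields).

```vb
Imports System.IO
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic.FileIO

Module FieldWidthSketch
    ' Hypothetical wrapper around the regex approach: the width of a field is
    ' the distance from its header word to the next header word.
    Function GetFieldWidths(headerLine As String) As Integer()
        Dim matches As MatchCollection = Regex.Matches(headerLine, "[^\W_\d]+")
        Dim widths(matches.Count - 1) As Integer
        For i = 0 To matches.Count - 1
            Dim nextStart As Integer =
                If(i < matches.Count - 1, matches(i + 1).Index, headerLine.Length)
            widths(i) = nextStart - matches(i).Index
        Next
        Return widths
    End Function

    Sub Main()
        ' Made-up content; a real file would be read from disk instead.
        Dim content As String =
            "Code    Name            City" & vbCrLf &
            "A1      Alice           Detroit" & vbCrLf
        Dim header As String = New StringReader(content).ReadLine()
        Dim widths As Integer() = GetFieldWidths(header)
        ' Let the last field run to the end of each line, since data there
        ' may be longer than the header word.
        widths(widths.Length - 1) = -1
        Using parser As New TextFieldParser(New StringReader(content))
            parser.TextFieldType = FieldType.FixedWidth
            parser.SetFieldWidths(widths)
            While Not parser.EndOfData
                Console.WriteLine(String.Join("|", parser.ReadFields()))
            End While
        End Using
    End Sub
End Module
```

Setting the last width to -1 is a design choice of this sketch, not part of the original answer; without it, a data line shorter than the header line would be reported as malformed.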

VB.NET read only some cells for each row from csv file

I have a csv file that has the following structure :
id,name,adress,email,age
1,john,1 str xxxx,john#gmail.com,19
2,mike,2 bd xxxx,mike#gmail.com,21
3,jeana,1 str ssss,jeana#gmail.com,18
.......................
.......................
What I would like to do is read the csv file, skip the first line (which contains the headers), extract the 2nd, 3rd and 4th fields from each row, and populate a DataGridView.
This is the code I'm using; however, it gives me all of the csv content:
DataGridView1.ColumnCount = 4
DataGridView1.Columns(0).Name = "ID"
DataGridView1.Columns(1).Name = "NAME"
DataGridView1.Columns(2).Name = "ADRESS"
DataGridView1.Columns(3).Name = "AGE"
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser _
        (openFile.FileName) 'the csv path
    'Specify that reading from a comma-delimited file'
    MyReader.TextFieldType = FileIO.FieldType.Delimited
    MyReader.SetDelimiters(",")
    Dim currentRow As String()
    While Not MyReader.EndOfData
        Try
            currentRow = MyReader.ReadFields()
            With DataGridView1.Rows.Add(currentRow) 'Add new row to data gridview'
            End With
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using
So can someone show me how to do that?
Thanks.
It could be as simple as reading the first line and discarding it, then starting to read the real data from your file:
Using MyReader As New TextFieldParser(openFile.FileName)
    MyReader.TextFieldType = FileIO.FieldType.Delimited
    MyReader.SetDelimiters(",")
    Dim currentRow As String()
    ' Read the first line and do nothing with it
    If Not MyReader.EndOfData Then
        currentRow = MyReader.ReadFields()
    End If
    While Not MyReader.EndOfData
        Try
            ' Read the next data line
            currentRow = MyReader.ReadFields()
            DataGridView1.Rows.Add(currentRow(1), currentRow(2), currentRow(3))
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using
EDIT: After seeing your comment below, I changed the line that adds the row so that it adds only the strings at positions 1, 2 and 3. This, of course, differs from the columns you added to the DataGridView, and it is not clear whether you want to change those columns to contain only these three fields. If you still want the ID and AGE columns in the grid, you could change the Add call to
DataGridView1.Rows.Add("", currentRow(1), currentRow(2), currentRow(3), "")

Read only particular fields from CSV File in vb.net

I have this code to read a CSV file. It reads each line, divides it by the delimiter ',' and stores the field values in the array 'strline()'.
How do I extract only required fields from the CSV file?
For example if I have a CSV File like
Type,Group,No,Sequence No,Row No,Date (newline)
0,Admin,3,345678,1,26052010 (newline)
1,Staff,5,78654,3,26052010
I need only the values of the columns Group, Sequence No and Date.
Thanks in advance for any ideas.
Dim myStream As StreamReader = Nothing
' Hold the parsed data
Dim strlines() As String
Dim strline() As String
Try
    myStream = File.OpenText(OpenFile.FileName)
    If (myStream IsNot Nothing) Then
        ' Index of the next line to process
        Dim placeholder As Integer = 0
        strlines = myStream.ReadToEnd().Split(Environment.NewLine)
        Do While placeholder < strlines.Length ' Stop after the last line of the CSV file
            strline = strlines(placeholder).Split(",")
            placeholder += 1
        Loop
    End If
Catch ex As Exception
    LogErrorException(ex)
Finally
    If (myStream IsNot Nothing) Then
        myStream.Close()
    End If
End Try
1) DO NOT USE String.Split!!
CSV data can contain commas inside a field, e.g.
id,name
1,foo
2,"hans, bar"
As the example above shows, you would also need to handle quoted fields etc... See CSV Info for more details.
2) Check out TextFieldParser - it handles all this sort of thing.
It will handle the myriad of different escapes you can't handle with String.Split...
Sample from: http://msdn.microsoft.com/en-us/library/cakac7e6.aspx
Using MyReader As New Microsoft.VisualBasic.FileIO.TextFieldParser("C:\TestFolder\test.txt")
    MyReader.TextFieldType = FileIO.FieldType.Delimited
    MyReader.SetDelimiters(",")
    Dim currentRow As String()
    While Not MyReader.EndOfData
        Try
            currentRow = MyReader.ReadFields()
            Dim currentField As String
            For Each currentField In currentRow
                MsgBox(currentField)
            Next
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MsgBox("Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using
The MyReader.ReadFields() part will get you an array of strings, from there you'll need to use the index etc...
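To illustrate the "use the index" step with the columns asked about, here is a small sketch. It assumes the column order shown in the question (Type, Group, No, Sequence No, Row No, Date), so Group, Sequence No and Date sit at zero-based positions 1, 3 and 5. The inline sample string stands in for the real file, and the module name is made up.

```vb
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module SelectColumnsSketch
    Sub Main()
        ' Sample rows from the question; in practice pass the file path
        ' to the TextFieldParser constructor instead.
        Dim csv As String =
            "Type,Group,No,Sequence No,Row No,Date" & vbCrLf &
            "0,Admin,3,345678,1,26052010" & vbCrLf &
            "1,Staff,5,78654,3,26052010" & vbCrLf
        Using parser As New TextFieldParser(New StringReader(csv))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' True is already the default; set here only to make explicit
            ' that quoted fields containing commas stay in one piece.
            parser.HasFieldsEnclosedInQuotes = True
            If Not parser.EndOfData Then parser.ReadFields() ' skip the header row
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                ' Zero-based positions: Group = 1, Sequence No = 3, Date = 5
                If fields.Length > 5 Then
                    Console.WriteLine(fields(1) & " " & fields(3) & " " & fields(5))
                End If
            End While
        End Using
    End Sub
End Module
```

The length check guards against short rows; a `Try`/`Catch` for `MalformedLineException`, as in the sample above, would still be needed for lines the parser cannot split at all.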
PK :-)
Maybe instead of only importing selected fields, you should import everything, then only use the ones you need.