Like many on here, I am new to programming and mainly focus on web development. I have written a program, cobbled together from help on here, that works well apart from one problem. It takes a CSV file and inserts it into an SQL database. I am getting a "MalformedLineException" on the last line of the CSV file and believe it is because the header line is not being skipped.
I would love some help working out how to skip the first line in my code below:
Private Sub subProcessFile(ByVal strFileName As String)
'This is the file location for the CSV File
Using TextFileReader As New Microsoft.VisualBasic.FileIO.TextFieldParser(strFileName)
'set the field type and the delimiter
TextFileReader.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited
TextFileReader.SetDelimiters(",")
ProgressBar1.Value = 0
Application.DoEvents()
'variables
Dim TextFileTable As DataTable = Nothing
Dim Column As DataColumn
Dim Row As DataRow
Dim UpperBound As Int32
Dim ColumnCount As Int32
Dim CurrentRow As String()
'Loop To read in data from CSV
While Not TextFileReader.EndOfData
Try
CurrentRow = TextFileReader.ReadFields()
If Not CurrentRow Is Nothing Then
' Check if DataTable has been created
If TextFileTable Is Nothing Then
TextFileTable = New DataTable("TextFileTable")
' Get number of columns
UpperBound = CurrentRow.GetUpperBound(0)
' Create new DataTable
For ColumnCount = 0 To UpperBound
Column = New DataColumn()
Column.DataType = System.Type.GetType("System.String")
Column.ColumnName = "Column" & ColumnCount
Column.Caption = "Column" & ColumnCount
Column.ReadOnly = True
Column.Unique = False
TextFileTable.Columns.Add(Column)
ProgressBar1.Value = 25
Application.DoEvents()
Next
clsDeletePipeLineData.main()
End If
Row = TextFileTable.NewRow
'Dim Rownum As Double = Row
'If Rownum >= 1715 Then
' MsgBox(Row)
'End If
For ColumnCount = 0 To UpperBound
Row("Column" & ColumnCount) = CurrentRow(ColumnCount).ToString
Next
TextFileTable.Rows.Add(Row)
clsInsertPipeLineData.main(CurrentRow(0).ToString, CurrentRow(1).ToString, CurrentRow(2).ToString, CurrentRow(3).ToString, CurrentRow(4).ToString, CurrentRow(5).ToString, CurrentRow(6).ToString, CurrentRow(7).ToString, CurrentRow(9).ToString)
ProgressBar1.Value = 50
Application.DoEvents()
End If
Catch ex As _
Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
" is not valid and will be skipped.")
End Try
End While
ProgressBar1.Value = 100
Application.DoEvents()
clsMailConfirmation.main()
MessageBox.Show("The process has been completed successfully")
End Using
End Sub
A "MalformedLineException" means that a line cannot be parsed using the current delimiters. To fix it, adjust the delimiters so the line can be parsed correctly, or add exception-handling code to deal with the line.
Someone ran into a similar question; maybe the reply there can help you.
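Since the question asks how to skip the header, here is a minimal sketch of one way to do it with TextFieldParser: read one raw line before the loop starts. The module name HeaderSkipSketch and the function ReadDataRows are made up for illustration, and the sketch reads from a TextReader rather than a file path so it can be tried without a file. Whether the header is really what causes the exception on the last line is not something this snippet can confirm.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module HeaderSkipSketch
    ' Hypothetical helper: returns every row of comma-delimited text
    ' except the first (header) line.
    Function ReadDataRows(source As TextReader) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(source)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")

            ' Consume the header once, before the loop. ReadLine returns the
            ' raw line without splitting it into fields.
            If Not parser.EndOfData Then parser.ReadLine()

            While Not parser.EndOfData
                Try
                    rows.Add(parser.ReadFields())
                Catch ex As MalformedLineException
                    ' LineNumber identifies the line that could not be parsed.
                    Console.WriteLine("Line " & ex.LineNumber & " is not valid and will be skipped.")
                End Try
            End While
        End Using
        Return rows
    End Function
End Module
```

In the original Sub, the equivalent change would be a single TextFileReader.ReadLine() call placed just before the While loop.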
How do I make this code work for a multiline TextBox?
AllNumbers1.AddRange(CType(TabControl2.TabPages(2).Controls("txtIntDraw" & x), TextBox).Text.Split(CChar(",")))
I want this code changed so that it reads from txtIntDraw.Lines(i) instead.
Here is all the code:
Try
'Throw everything into a list of String initially.
Dim AllNumbers1 As New List(Of String)
'Loop through each TextBox, splitting them by commas
For x = 1 To Val(txtXCount.Text)
AllNumbers1.AddRange(CType(TabControl2.TabPages(2).Controls("txtIntDraw" & x), TextBox).Text.Split(CChar(",")))
Next
'Remove non-integer entries.
AllNumbers1.RemoveAll(Function(x) Integer.TryParse(x, New Integer) = False)
'Join the distinct list to an array, then back to comma separated format into wherever you want it output.
OutputText1.Text = String.Join(",", AllNumbers1.Distinct().ToArray())
Dim part() As String = OutputText1.Text.Split(",")
Dim partCount As Integer = part.Length
TextBox6.Text = partCount
Array1()
Catch ex As Exception
End Try
Is it this simple? Instead of joining the distinct numbers with ",", use Environment.NewLine.
OutputText1.Text = String.Join(Environment.NewLine, AllNumbers1.Distinct())
But your code can be simplified into a method
Private Sub doStuff(delimitersIn As String(), delimiterOut As String)
Dim allNumbers As New List(Of Integer)()
For x = 1 To CInt(txtXCount.Text)
allNumbers.AddRange(TabControl2.TabPages(2).Controls("txtIntDraw" & x).Text.Split(delimitersIn, StringSplitOptions.RemoveEmptyEntries).Where(Function(s) Integer.TryParse(s, New Integer)).Select(Function(s) CInt(s)))
Next
Dim distinctNumbers = allNumbers.Distinct()
OutputText1.Text = String.Join(delimiterOut, distinctNumbers)
TextBox6.Text = distinctNumbers.Count().ToString()
End Sub
Call it with both delimiters
doStuff({Environment.NewLine, ","}, ",")
Or just the newline
doStuff({Environment.NewLine}, ",")
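To try the same idea without the form controls, the sketch below pulls the splitting and filtering into a plain function. The names DistinctNumbersSketch and DistinctIntegers are invented for this example; the input strings stand in for the Text of each txtIntDraw box.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module DistinctNumbersSketch
    ' Hypothetical helper: splits each input string on the given delimiters,
    ' keeps only the pieces that parse as integers, and removes duplicates.
    Function DistinctIntegers(texts As IEnumerable(Of String), delimitersIn As String()) As List(Of Integer)
        Dim result As New List(Of Integer)
        For Each boxText As String In texts
            For Each part As String In boxText.Split(delimitersIn, StringSplitOptions.RemoveEmptyEntries)
                Dim number As Integer
                ' TryParse returns False for non-numeric pieces, which are dropped.
                If Integer.TryParse(part, number) Then result.Add(number)
            Next
        Next
        ' Distinct keeps the first occurrence of each value, in input order.
        Return result.Distinct().ToList()
    End Function
End Module
```

Passing {Environment.NewLine, ","} as the delimiters covers boxes that mix line breaks and commas; String.Join can then turn the result back into text with whichever output delimiter is wanted.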
I am reading a CSV file through OLE DB. My main issue is that string values in the CSV are trimmed while being read (white space is removed at both the beginning and the end). Some of the data in the CSV file needs to keep that white space, but only in certain cases, which is why I cannot fix it up after the data has been processed. It has to be handled during the conversion.
Unfortunately, it has to be done with OLE DB and VB.NET, as our complex mechanism is based on those technologies.
Is there a hack or workaround that stops OLE DB from trimming my strings?
Below is my code, actual results and expected:
csv file:
Column1|Column2|Column3|Column4
Text1 | Text2| Text3 |Text4
schema.ini
[test.csv]
Format=Delimited(|)
Col1=Column1 Text
Col2=Column2 Text
Col3=Column3 Text
Col4=Column4 Text
Code
Private conn As New OleDbConnection
Private cmd As New OleDbCommand
Private myAccessDataReader As OleDb.OleDbDataReader = Nothing
Sub Main()
Try
Dim dirInfo As String = "C:\csv"
If conn.State = ConnectionState.Open Then
conn.Close()
End If
conn.ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=" & dirInfo & ";Extended Properties=""Text;HDR=Yes;"";"
conn.Open()
cmd = New OleDbCommand("SELECT * From [test.csv]", conn)
myAccessDataReader = cmd.ExecuteReader()
If myAccessDataReader.HasRows Then
myAccessDataReader.Read()
End If
Console.WriteLine("|" + myAccessDataReader.Item("Column1") + "|")
Console.WriteLine("|" + myAccessDataReader.Item("Column2") + "|")
Console.WriteLine("|" + myAccessDataReader.Item("Column3") + "|")
Console.WriteLine("|" + myAccessDataReader.Item("Column4") + "|")
Console.ReadKey()
Catch ex As Exception
Throw New Exception(ex.Message)
End Try
End Sub
Actual Results:
|Text1|
|Text2|
|Text3|
|Text4|
Expected Results:
|Text1 |
| Text2|
| Text3 |
|Text4|
P.S. I have tried different settings in schema.ini (encoding, MaxScanRows, fixed width), but nothing helped.
I guess there is a general issue with trailing spaces when dealing with databases: some char data types use spaces to pad out the remaining characters. For MS SQL there is an ANSI PADDING option that you can turn ON or OFF, but I don't see a way to set that for the Microsoft Jet engine, which we use for CSV files; we support both OLE DB and ODBC, and this issue exists for both.
So, the answer is that you can't. Trailing spaces are always removed when you import data from a CSV data source, no matter whether you define a text/char/memo data type for your columns (e.g. using schema.ini) or enclose the strings in double quotes. You can put a special non-space character, such as a tab, at the end, after the space(s).
microsoft website
Try this out, but there's no guarantee, since I haven't added any error handling:
Function ReadCSVToTable(ByVal Schema As String) As DataTable
Dim file As New StreamReader("C:\dump\" & Schema)
Dim CSVName As String = file.ReadLine()
CSVName = Strings.Mid(CSVName, 2, CSVName.Length - 2)
Dim Delimiter As String = file.ReadLine
'take the text between "(" and ")", however many characters it has
Delimiter = Strings.Mid(Delimiter, Strings.InStr(Delimiter, "(") + 1, Strings.InStr(Delimiter, ")") - Strings.InStr(Delimiter, "(") - 1)
Dim Buffer As String = ""
Dim xtable As New DataTable
xtable.TableName = CSVName
'create table
Do
Buffer = file.ReadLine
Dim xCol As New DataColumn
With xCol
.ColumnName = Buffer.Split("=")(0)
.Caption = Buffer.Split("=")(1).Split(" ")(0)
Select Case Buffer.Split("=")(1).Split(" ")(1).ToLower
Case "text"
.DataType = GetType(String)
Case "integer"
.DataType = GetType(Integer)
Case "decimal"
.DataType = GetType(Decimal)
Case "boolean"
.DataType = GetType(Boolean)
Case Else
.DataType = GetType(String)
End Select
End With
xtable.Columns.Add(xCol)
Loop Until file.EndOfStream = True
file.Close()
file.Dispose()
'Fill the table
file = New StreamReader("C:\dump\" & CSVName)
'skip header
Buffer = file.ReadLine
Do
Buffer = file.ReadLine
Dim xCol(xtable.Columns.Count - 1) As Object
Dim xCount As Integer = 0
For Each tCol As DataColumn In xtable.Columns
Select Case tCol.DataType
Case GetType(String)
xCol(xCount) = Convert.ToString(Buffer.Split(New String() {Delimiter}, StringSplitOptions.None)(xCount))
Case GetType(Integer)
xCol(xCount) = Convert.ToInt32(Buffer.Split(New String() {Delimiter}, StringSplitOptions.None)(xCount))
Case GetType(Decimal)
xCol(xCount) = Convert.ToDecimal(Buffer.Split(New String() {Delimiter}, StringSplitOptions.None)(xCount))
Case GetType(Boolean)
xCol(xCount) = Convert.ToBoolean(Buffer.Split(New String() {Delimiter}, StringSplitOptions.None)(xCount))
Case Else
xCol(xCount) = Convert.ToString(Buffer.Split(New String() {Delimiter}, StringSplitOptions.None)(xCount))
End Select
xCount = xCount + 1
Next
xtable.Rows.Add(xCol)
Loop Until file.EndOfStream = True
file.Close()
file.Dispose()
Return xtable
End Function
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
Dim CSVTable As DataTable = ReadCSVToTable("schema.ini")
End Sub
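As another option outside OLE DB (so it does not meet the original constraint), TextFieldParser has a TrimWhiteSpace property that can be switched off. The sketch below is only an illustration under that assumption: the names UntrimmedFieldsSketch and ReadUntrimmedRows are made up, and it reads from a TextReader so it can be tried with the sample data from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module UntrimmedFieldsSketch
    ' Hypothetical helper: parses delimited text, skips the header line,
    ' and keeps leading and trailing spaces inside each field.
    Function ReadUntrimmedRows(source As TextReader, delimiter As String) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(source)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(delimiter)
            ' The default is True, which strips white space around every field.
            parser.TrimWhiteSpace = False

            ' Skip the header line.
            If Not parser.EndOfData Then parser.ReadLine()

            While Not parser.EndOfData
                rows.Add(parser.ReadFields())
            End While
        End Using
        Return rows
    End Function
End Module
```

The rows come back as plain string arrays, so they would still need to be loaded into a DataTable by hand if the rest of the code expects one.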
I want to read fixed length files.
I know how to do this if I know the field lengths.
Using Reader As New Microsoft.VisualBasic.FileIO.TextFieldParser(filePath)
Reader.TextFieldType =
Microsoft.VisualBasic.FileIO.FieldType.FixedWidth
Reader.SetFieldWidths(8, 16, 16, 12, 14, 16) 'They are different in each file
Dim currentRow As String()
While Not Reader.EndOfData
Try
currentRow = Reader.ReadFields()
Dim currentField As String
For Each currentField In currentRow
MsgBox(currentField)
Next
Catch ex As Microsoft.VisualBasic.
FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
" is not valid and will be skipped.")
End Try
End While
End Using
The problem is that I don't know the length of each field.
Is there a way to read the first line and get the Field lengths?
Since I have found the answer, I'll post it here in case someone faces the same problem.
I solved it using a regex that finds where each column name starts in the first line:
'sRowData is the first line of the file
Dim pArLengths() As Integer = Nothing 'Array to store the lengths
Dim regex As New Regex("[^\W_\d]+", _
RegexOptions.IgnoreCase _
Or RegexOptions.Multiline _
Or RegexOptions.Singleline _
Or RegexOptions.IgnorePatternWhitespace)
Dim myMatches As MatchCollection = regex.Matches(sRowData)
ReDim pArLengths(myMatches.Count - 1)
For i = 0 To myMatches.Count - 1
Dim k As Integer
k = If(i < myMatches.Count - 1, myMatches(i + 1).Index, sRowData.Length)
pArLengths(i) = k - myMatches(i).Index
Next
I hope that someone will find it useful.
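For anyone who wants to try the idea in isolation, here is a small sketch of the same width calculation as a function. It swaps in the pattern \S+ (any run of non-space characters) instead of the pattern above, and the names FieldWidthSketch and WidthsFromHeader are invented. It assumes the first column starts at position 0 and that no column name contains a space.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module FieldWidthSketch
    ' Hypothetical helper: treats the start of each header word as the start
    ' of a column and returns the distance to the next column start.
    Function WidthsFromHeader(headerLine As String) As Integer()
        Dim matches As MatchCollection = Regex.Matches(headerLine, "\S+")
        Dim widths(matches.Count - 1) As Integer
        For i As Integer = 0 To matches.Count - 1
            ' The last column runs to the end of the header line.
            Dim nextStart As Integer = If(i < matches.Count - 1, matches(i + 1).Index, headerLine.Length)
            widths(i) = nextStart - matches(i).Index
        Next
        Return widths
    End Function
End Module
```

The resulting array can be passed straight to Reader.SetFieldWidths, since that method accepts an Integer array.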
I have data like this in a text file:
12343,M,Helen Beyer,92149999,21,F,10,F,F,T,T,T,F,F
54326,F,Donna Noble,92148888,19,M,99,T,F,T,F,T,F,T
99999,M,Ed Harrison,92147777,28,F,5,F,F,F,F,F,F,T
88886,F,Amy Pond,92146666,31,M,2,T,F,T,T,T,T,T
37378,F,Martha Jones,92144444,30,M,5,T,F,F,F,T,T,T
22444,M,Tom Scully,92145555,42,F,6,T,T,T,T,T,T,T
81184,F,Sarah Jane Smith,92143333,22,F,5,F,F,F,T,T,T,F
97539,M,Angus Harley,92142222,22,M,9,F,T,F,T,T,T,T
24686,F,Rose Tyler,92142222,22,M,5,F,F,F,T,T,T,F
11113,F,Jo Grant,92142222,22,M,5,F,F,F,T,T,T,F
I want to extract the Initial of the first name and complete surname. So the output should look like:
H. Beyer, M
D. Noble, F
E. Harrison, M
The catch is that I am not allowed to use the String.Split function. I have to do it with some other form of string handling.
This is my code:
Public Sub btn_IniSurGen_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btn_IniSurGen.Click
Dim vFileName As String = "C:\temp\members.txt"
Dim vText As String = String.Empty
If Not File.Exists(vFileName) Then
lbl_Output.Text = "The file " & vFileName & " does not exist"
Else
Dim rvSR As New IO.StreamReader(vFileName)
Do While rvSR.Peek <> -1
vText = rvSR.ReadLine() & vbNewLine
lbl_Output.Text += vText.Substring(8, 1)
Loop
rvSR.Close()
End If
End Sub
You can use the TextFieldParser class. It will parse the file and return each line's fields directly to you as a string array.
Using MyReader As New Microsoft.VisualBasic.FileIO.
TextFieldParser("c:\logs\bigfile")
MyReader.TextFieldType =
Microsoft.VisualBasic.FileIO.FieldType.Delimited
MyReader.Delimiters = New String() {","}
Dim currentRow As String()
'Loop through all of the fields in the file.
'If any lines are corrupt, report an error and continue parsing.
While Not MyReader.EndOfData
Try
currentRow = MyReader.ReadFields()
' Include code here to handle the row.
Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
MsgBox("Line " & ex.Message &
" is invalid. Skipping")
End Try
End While
End Using
To get the result you want, you can change
lbl_Output.Text += vText.Substring(8, 1)
to
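'declare these first
Dim sInit As String
Dim sName As String
sInit = vText.Substring(6, 1) 'the M/F flag after the first comma
sName = ""
'collect the full name one character at a time, up to the next comma
For x As Integer = 8 To vText.Length - 1
If vText.Substring(x, 1) = "," Then Exit For
sName &= vText.Substring(x, 1)
Next
'first initial, then the last word of the name as the surname
lbl_Output.Text += sName.Substring(0, 1) & ". " & sName.Substring(sName.LastIndexOf(" "c) + 1) & ", " & sInit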
It would be better, though, to use more than one lbl_Output control.
Something like this should work:
Dim lines As New List(Of String)
For Each s As String In File.ReadAllLines("textfile3.txt")
Dim temp As String = ""
s = s.Substring(s.IndexOf(","c) + 1)
temp = ", " + s.First
s = s.Substring(s.IndexOf(","c) + 1)
temp = s.First + ". " + s.Substring(s.IndexOf(" "c) + 1, s.IndexOf(","c) - s.IndexOf(" "c) - 1) + temp
lines.Add(temp)
Next
The list lines will contain the strings you need.
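The same IndexOf and Substring approach can also be written as a function that handles one line at a time, which makes it easy to check against the sample data. This is only a sketch: the names MemberNameSketch and FormatMember are made up, and it assumes the surname is the last word of the name field, so a three-part name keeps only its final word.

```vbnet
Imports System

Module MemberNameSketch
    ' Hypothetical helper: turns "12343,M,Helen Beyer,..." into "H. Beyer, M"
    ' using only IndexOf and Substring (no String.Split).
    Function FormatMember(line As String) As String
        ' Locate the first three commas; the flag and the name sit between them.
        Dim firstComma As Integer = line.IndexOf(","c)
        Dim secondComma As Integer = line.IndexOf(","c, firstComma + 1)
        Dim thirdComma As Integer = line.IndexOf(","c, secondComma + 1)

        Dim gender As String = line.Substring(firstComma + 1, secondComma - firstComma - 1)
        Dim fullName As String = line.Substring(secondComma + 1, thirdComma - secondComma - 1)

        ' Assumption: the surname is everything after the last space.
        Dim surnameStart As Integer = fullName.LastIndexOf(" "c) + 1
        Return fullName.Substring(0, 1) & ". " & fullName.Substring(surnameStart) & ", " & gender
    End Function
End Module
```

Because the commas are located rather than hard-coded, this version does not depend on the ID always being five digits long.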