The following code I have reads a tab delimited file into a DataGridView. It works fine, but there are a couple of issues I'm not exactly sure how to address.
Dim query = From line In IO.File.ReadAllLines("C:\Temp\Temp.txt")
Let Data = line.Split(vbTab)
Let field1 = Data(0)
Let field2 = Data(1)
Let field3 = Data(2)
Let field4 = Data(3)
DataGridView1.DataSource = query.ToList
DataGridView1.Columns(0).Visible = False
How do I go about adding fields (columns) based on the number of fields in the header row? The header row currently contains 110 fields, which I'd hate to define in a similar manner to Let field1 = Data(0)
I'd also need to skip the header row and only display the lines after this.
Is there a better way to handle this than what I'm currently doing?
There are several tools to parse this type of file. One is OleDB.
I can't quite figure out how the (deleted) answer works, because HDR=No; tells the Text Driver that the first row does not contain column names. That setting is sometimes ignored, though, after the driver reads the first 8 lines without IMEX.
However, FMT=Delimited\""" looks like it was copied from a C# answer, because VB doesn't use \ to escape characters. It also appears to confuse the column delimiter (comma or tab in this case) with the text delimiter (usually ").
If the file is tab delimited, the correct value would be FMT=TabDelimited. I am guessing that the fields are text delimited with quotes (e.g. "France" "Paris" "2.25") and OleDB is chopping the data by quotes rather than tabs to accidentally get the same result.
The correct ACE string would be:
Dim connstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='C:\Temp';Extended Properties='TEXT;HDR=Yes;FMT=TabDelimited';"
Using just the connection string will import each field as a String. You can also have OleDB convert the data it reads to the intended datatype, so that you do not have to litter your code with lots of Convert.ToXXXX calls to convert the String data.
This requires using a Schema.INI to define the file. This replaces most of the Extended Properties in the connection string, leaving only Extended Properties='TEXT';" (which means use the TEXT Driver). Create a file named Schema.INI in the same folder as the data:
[Capitals.txt]
ColNameHeader=True
CharacterSet=437
Format=TabDelimited
TextDelimiter="
DecimalSymbol=.
CurrencySymbol=$
Col1="Country" Text Width 254
Col2="Capital City" Text Width 254
Col3="Population" Single
Col4="Fake" Integer
One Schema.INI can contain the layout for many files. Each file has its own section titled with the name of the file (e.g. [FooBar.CSV], [Capitals.txt], etc.).
Most of the entries should be self-explanatory, but FORMAT defines the column delimiter (TabDelimited, CSVDelimited or custom Delimited(;)); TextDelimiter is the character used to enclose column data when it might contain spaces or other special characters. Entries like CurrencySymbol let you allow for a foreign symbol and can be omitted.
The ColN= listings are where you can rename columns and specify the datatype. This might be tedious to enter for 100+ columns, however it would probably be mostly copy and paste. Once it is done you'd always have it and be able to easily use typed data.
You do not need to specify the column names/size/type to use a Schema.INI. If the file includes column names as the first row (ColNameHeader=True), you can use the Schema simply to specify the various parameters in a clear and readable fashion rather than squeezing them into the connection string.
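As a sketch of that lighter use, the fragment below is the sample Schema.INI from above with the ColN lines removed. It assumes the same hypothetical Capitals.txt file with a header row; with no ColN entries, the column names come from the first row and the data types are not declared.

```ini
[Capitals.txt]
ColNameHeader=True
CharacterSet=437
Format=TabDelimited
TextDelimiter="
```

The connection string then only needs Extended Properties='TEXT', as described earlier.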
OleDB looks for a Schema.INI in the same folder as the import file, and then looks for a section bearing the exact name of the "table" used in the SQL:
' form level DT var
Private capDT As DataTable
' procedure code to load the file:
Dim connstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='C:\Temp';Extended Properties='TEXT';"
Dim SQL = "SELECT * FROM Capitals.txt"
capDT = New DataTable
' USING will close and dispose of resources
Using cn As New OleDbConnection(connstr),
cmd As New OleDbCommand(SQL, cn)
cn.Open()
Using da As New OleDbDataAdapter(cmd)
da.Fill(capDT)
End Using
End Using ' close and dispose
The DataTable is now ready to use. If we iterate the columns, you can see they match the Type specified in the schema:
' display data types
For n As Int32 = 0 To capDT.Columns.Count - 1
Console.WriteLine("name: {0}, datatype: {1}",
capDT.Columns(n).ColumnName,
capDT.Columns(n).DataType.ToString)
Next
Output:
name: Country, datatype: System.String
name: Capital City, datatype: System.String
name: Population, datatype: System.Single
name: Fake, datatype: System.Int32
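For comparison, here is a minimal sketch of another tool for this job that avoids OleDB entirely: Microsoft.VisualBasic.FileIO.TextFieldParser. It is not the approach shown above, just an alternative under these assumptions: the first row holds unique column names, no data row has more fields than the header, and every column is kept as String. The module and function names (TabFileLoader, LoadTabDelimited) are made up for illustration.

```vbnet
Imports System.Data
Imports System.IO
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module TabFileLoader
    ' Builds a DataTable whose columns are created from the header row.
    ' Every column is typed as String; no type conversion is attempted.
    Function LoadTabDelimited(reader As TextReader) As DataTable
        Dim dt As New DataTable()
        Using parser As New TextFieldParser(reader)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(vbTab)
            parser.HasFieldsEnclosedInQuotes = True

            ' First row: add one column per header field,
            ' whether there are 4 fields or 110.
            If Not parser.EndOfData Then
                For Each headerName As String In parser.ReadFields()
                    dt.Columns.Add(headerName, GetType(String))
                Next
            End If

            ' Remaining rows are data, so the header never shows up as a row.
            While Not parser.EndOfData
                dt.Rows.Add(parser.ReadFields())
            End While
        End Using
        Return dt
    End Function
End Module
```

With the file from the question, usage would look like DataGridView1.DataSource = LoadTabDelimited(New StreamReader("C:\Temp\Temp.txt")). Unlike the Schema.INI route, this gives you no typed columns, so any numeric or date handling is left to the calling code.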
See also:
Schema.INI for most legal settings
Code Page Identifiers for the values to use for CharacterSet
Related
I have a large CSV file to import (and has to import on a regular basis). The biggest problem is that one of the fields contains descriptions that use double-quotes. So a field might have a value inside the raw csv file like this:
...,...,"100/5 OZ 5/8" x 1/2" x 1/2" Cube",...,...
I currently have simple ADO code that pulls the CSV into a table, but that won't work because of these double quotes :( Currently I am just trying to pull the CSV into a datatable, but the final effort will be to push into a SQL Server table.
Simple ADO Code:
Dim cnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='C:\csv\';Extended Properties='text;HDR=No;FMT=Delimited';"
Dim dt As New DataTable
Using tblAdp As New OleDbDataAdapter("Select * from [ZMES.csv]", cnStr)
tblAdp.Fill(dt)
End Using
DataGridView1.DataSource = dt
When it hits those " in the midst of the string, it truncates that row and moves to the next. I need to do something similar to what I would do inserting quotes into a database in php (escaping the double quotes), but not real sure how. As a test, I also tried LumenWorks CSV Reader and it faulted out on those lines.
I am using the below code to import a CSV file to my Access DB. I just have a couple of questions.
Con.Open()
Dim strSqlCommand = "SELECT F1 AS id, F2 AS firstname " &
"INTO MyNewTable " &
"FROM [Text;FMT=Delimited;HDR=No;CharacterSet=850;DATABASE=" & GlobalVariables.strDefaultDownloadPath & "].Airports.csv;"
Dim sqlCommand = New System.Data.OleDb.OleDbCommand(strSqlCommand, Con)
sqlCommand.ExecuteNonQuery()
Con.Close()
How can I change the Character Set to UTF-8? If I enter utf8 instead of 850 I get an error.
Also, the first line of my CSV file contains the column names. Can I amend the above code to take that in to account?
Regards,
Andrew
You could run into trouble trying to import and select all at once. For one thing, you may not want to leave converting data types up to Access. To avoid that, you will need 2 connections and 2 SQL strings: one to select from the CSV and one to insert into Access.
The connection string will need to look something like this:
"Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\Temp\Tmp;Extended Properties='TEXT;HDR=Yes;FMT=Delimited;CharacterSet=ANSI'"
Note that just the path is listed and the Extended Properties are enclosed in ticks. If the first line has headers/field names then HDR=Yes will skip them in the result set. One of the benefits of having field names as the first line is that OleDB will use them as column names (no need for F1 As foo, F2 As bar; in fact that will fail because they have been renamed from F1, F2...).
The SQL to read from the CSV:
"SELECT * FROM filename.csv"
There are several ways to process it. You could use a reader to read a row at a time to INSERT them into the Access database. This is probably simpler: get all the data from the CSV into a DataTable and use it to INSERT into Access:
Private myDT As DataTable ' form level variable
...
Dim csvStr As String = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\Temp\Tmp;Extended Properties='TEXT;HDR=Yes;FMT=Delimited;CharacterSet=ANSI'"
Dim csvSQL = "SELECT * FROM Capitals.csv" ' use YOUR file name
Using csvCn = New OleDbConnection(csvStr),
cmd As New OleDbCommand(csvSQL, csvCn)
Using da As New OleDbDataAdapter(cmd)
myDT = New DataTable
da.Fill(myDT)
End Using
End Using
For Each r As DataRow In myDT.Rows
'ToDo: INSERT INTO Access
Next
The Connection, Command and DataAdapter are all resources, so they are in USING blocks to dispose of them when we are done with them. myDT will have a collection of Rows, each with a collection of Items representing the fields from the CSV. Just loop thru the rows adding the desired items to the Access DB.
You will very likely have to do some data type conversion from String to Integer or DateTime etc.
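A minimal sketch of that conversion step follows, assuming invariant-culture number formats and a date format you supply yourself. The module and function names (FieldConversion, ToNullableInteger, ToNullableDate) are hypothetical helpers, not part of OleDB. Each returns Nothing instead of throwing when the text cannot be converted, so the loop can decide what to do with bad rows.

```vbnet
Imports System.Globalization

Module FieldConversion
    ' Returns the parsed Integer, or Nothing when the text is not a whole number.
    Function ToNullableInteger(text As String) As Integer?
        Dim value As Integer
        If Integer.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, value) Then
            Return value
        End If
        Return Nothing
    End Function

    ' Returns the parsed Date, or Nothing when the text does not match the given format.
    Function ToNullableDate(text As String, dateFormat As String) As Date?
        Dim value As Date
        If Date.TryParseExact(text, dateFormat, CultureInfo.InvariantCulture, DateTimeStyles.None, value) Then
            Return value
        End If
        Return Nothing
    End Function
End Module
```

Inside the For Each loop this might be used as Dim population = ToNullableInteger(r("Population").ToString()), where "Population" stands in for whichever column name your CSV actually has.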
As for the question about UTF8 - you can use the Codepage identifier. If you leave it off the connection string it will use whatever is in the Registry which may also work. For UTF8 use CharacterSet=65001.
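For illustration, this is the connection string from above with only the CharacterSet value changed to the UTF-8 code page. The path is the same placeholder used earlier, not a real location.

```text
Provider=Microsoft.Jet.OLEDB.4.0; Data Source=C:\Temp\Tmp;Extended Properties='TEXT;HDR=Yes;FMT=Delimited;CharacterSet=65001'
```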
I am using OleDb to get data from a .txt file and I have encountered an error.
Dim oleDB = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\CompName\C$\Path;Extended Properties=""Text;HDR=Yes;FMT=Fixed"""
where CompName and Path are real values
I get "Unspecified error" in the adapter's Fill call:
Using connection As New OleDbConnection(oleDb)
Using command As New OleDbCommand(sql, connection)
Using adapter As New OleDbDataAdapter(command)
adapter.Fill(s)
End Using
End Using
End Using
Return s
End Function
Has anyone tried getting data across an intranet from a different computer using OleDb?
To use the OleDb Text Driver with a text file formatted with fixed length columns you need to have a SCHEMA.INI file in the same folder where the text files are located.
The SCHEMA.INI lets you define various properties for the text file, such as the format, field names, widths and types, character set and some conversion rules.
From MSDN
When the Text driver is used, the format of the text file is
determined by using a schema information file. The schema information
file is always named Schema.ini and always kept in the same directory
as the text data source. The schema information file provides the
IISAM with information about the general format of the file, the
column name and data type information, and several other data
characteristics. A Schema.ini file is always required for accessing
fixed-length data. You should use a Schema.ini file when your text
table contains DateTime, Currency, or Decimal data, or any time that
you want more control over the handling of the data in the table.
More details about the SCHEMA.INI file can be found on this MSDN page
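As a rough sketch of what such a file could look like for fixed-length data, the section below describes a hypothetical Employees.txt with three columns. The file name, column names and widths are invented for illustration; the real values have to match the layout of your own file.

```ini
[Employees.txt]
Format=FixedLength
ColNameHeader=False
CharacterSet=ANSI
Col1=EmployeeId Text Width 10
Col2=LastName Text Width 20
Col3=HoursWorked Long Width 4
```

The section name must match the file name used as the "table" in the SQL, and the Schema.ini must sit in the same folder that Data Source points to, which here would be the network share.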
I have this txt file with the following information:
National_Insurence_Number;Name;Surname;Hours_Worked;Price_Per_Hour so:
eg.: aa-12-34-56-a;Peter;Smith;36;12
This data has been entered into the txt file through a VB form, which works totally fine. The problem comes on another form. This is what I expect it to do:
The user will input into a text box the employees NI Number.
The program will then search the file for that NI Number and, if found,
It will fill in the appropriate text boxes with its data.
(Then the program calculates tax and national insurance which i got working fine)
So basically the problem comes telling the program to search that NI number and introduce each ";" delimited field into its corresponding text box.
Thanks for all.
You just need to parse the file like a csv, you can use Microsoft.VisualBasic.FileIO.TextFieldParser to do this or you can use CSVHelper - https://github.com/JoshClose/CsvHelper
I've used csv helper in the past and it works great, it allows you to create a class with the structure of the records in your data file then imports the data into a list of these for searching.
You can look here for more info on TextFieldParser if you want to go that way -
Parse Delimited CSV in .NET
Dim afile As FileIO.TextFieldParser = New FileIO.TextFieldParser(FileName)
Dim CurrentRecord As String() ' this array will hold each line of data
afile.TextFieldType = FileIO.FieldType.Delimited
afile.Delimiters = New String() {";"}
afile.HasFieldsEnclosedInQuotes = True
' parse the actual file
Do While Not afile.EndOfData
Try
CurrentRecord = afile.ReadFields
Catch ex As FileIO.MalformedLineException
Stop
End Try
Loop
I'd recommend using CsvHelper though; the documentation is pretty good, and working with objects is much easier than working with the raw string data.
Once you have found the record you can then manually set the text of each text box on your form or use a bindingsource.
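The search itself could be sketched like this with TextFieldParser. It assumes the NI number is always the first field and that a case-insensitive match is acceptable; the module and function names (EmployeeLookup, FindByNiNumber) are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module EmployeeLookup
    ' Returns the fields of the first record whose first field equals niNumber
    ' (case-insensitive), or Nothing when no record matches.
    Function FindByNiNumber(reader As TextReader, niNumber As String) As String()
        Using parser As New TextFieldParser(reader)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(";")
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                If fields IsNot Nothing AndAlso fields.Length > 0 AndAlso
                   String.Equals(fields(0), niNumber, StringComparison.OrdinalIgnoreCase) Then
                    Return fields
                End If
            End While
        End Using
        Return Nothing
    End Function
End Module
```

On the form, something like Dim rec = FindByNiNumber(New StreamReader(FileName), txtNiNumber.Text) would return the record, after which rec(1) through rec(4) hold Name, Surname, Hours_Worked and Price_Per_Hour for the text boxes. The control name txtNiNumber is a placeholder.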
I am reading a csv file via StreamReader. The issue is that if the data in the csv file is like "Read", then through the StreamReader the same data comes in as ""Read"". How do I remove these extra inverted commas?
It sounds like you're dealing with a CSV that has some (or all) of its fields quoted. If that's the case, I'd recommend using the Microsoft.VisualBasic.FileIO.TextFieldParser (which a lot of people don't seem to know about, and yes despite the namespace it can be used with C#).
Imports System.IO ' for StringReader
Imports Microsoft.VisualBasic.FileIO ' for TextFieldParser
Dim csvString As String = "25,""This is text"",abdd,""more quoted text"""
Dim parser as TextFieldParser = New TextFieldParser(New StringReader(csvString))
' You can also read from a file
' Dim parser As TextFieldParser = New TextFieldParser("mycsvfile.csv")
parser.HasFieldsEnclosedInQuotes = True
parser.SetDelimiters(",")
Dim fields As String()
While Not parser.EndOfData
fields = parser.ReadFields()
For Each field As String In fields
Console.WriteLine(field)
Next
End While
parser.Close()
The output should be:
25
This is text
abdd
more quoted text
Microsoft.VisualBasic.FileIO.TextFieldParser
To Import this, you'll need to add a reference to Microsoft.VisualBasic to your project.