I am trying to import a large array of integers stored in a CSV file into a VB.Net DataTable called BeamMap. The .csv file consists only of integers, delimited by commas, with no quotes around the data (i.e., 1,3,-2,44,1), and each line ends with a line feed and carriage return. All I want to do is get each integer into a DataTable cell with the appropriate rows and columns (every row has the same number of columns) and be able to reference it later in my code. I really don't want anything more than absolutely necessary in the code (no titles, captions, headings, etc.), and I need it to be fairly efficient (the CSV array is approximately 1000 x 1000).
Thanks!
Use the OleDb provider to read the CSV and populate the DataTable.
Dim folder = "c:\location\of\csv\files\"
Dim CnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & folder & ";Extended Properties=""text;HDR=No;FMT=Delimited"";"
Dim dt As New DataTable
Using Adp As New OleDbDataAdapter("select * from [nos.csv]", CnStr)
Adp.Fill(dt)
End Using
Here's a simple approach which requires a strict format (as you've mentioned):
Dim lines = IO.File.ReadAllLines(path)
Dim tbl = New DataTable
Dim colCount = lines.First.Split(","c).Length
For i As Int32 = 1 To colCount
tbl.Columns.Add(New DataColumn("Column_" & i, GetType(Int32)))
Next
For Each line In lines
Dim objFields = From field In line.Split(","c)
Select CType(Int32.Parse(field), Object)
Dim newRow = tbl.Rows.Add()
newRow.ItemArray = objFields.ToArray()
Next
Getting the file from a mapped drive and putting the retrieved data in a dataset:
Dim folder = "Z:\"
Dim CnStr = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & folder & ";Extended Properties=""text;HDR=No;FMT=Delimited"";"
Dim dssample As New DataSet
Using Adp As New OleDbDataAdapter("select * from [samplecsv.csv]", CnStr)
Adp.Fill(dssample)
End Using
If dssample.Tables.Count > 0 Then
'some code here
End If
Also, don't forget to include the following import:
Imports System.Data.OleDb
And if you wish to link to a DataGridView (after read):
Dim bs As New BindingSource
bs.DataSource = dt
DataGridView1.DataSource = bs
I have a .CSV file that I'm using to fill a DataTable in my application.
One of the cells is a number, for example: 5720358152
These values seem to be getting skipped, and nothing is put into the DataTable.
Here is my code:
Dim csvFile As String = "I:\STOCK.csv"
Dim folder = "I:\"
Dim csvStr As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & folder & ";Extended Properties=""text;HDR=No;FMT=Delimited"";"
Dim csvSQL = "SELECT * FROM [STOCK.csv]"
Dim MyDT As DataTable
Using csvCn = New OleDbConnection(csvStr),
cmd As New OleDbCommand(csvSQL, csvCn)
Using da As New OleDbDataAdapter(cmd)
MyDT = New DataTable
da.Fill(MyDT)
End Using
End Using
StockTakeDGV.DataSource = MyDT
Here is a list of numbers that get left out.
5720358152
5720358150
5720358146
5720350121
5720324303
5720308119
5720308118
5720308115
5720308114
5720308110
5720308104
But these numbers are fine:
4021021135
4021021132
4021021126
1320203187
1320023154
At first I thought it might be the number of digits, but other numbers with this many digits work. I now assume it has more to do with the "type" of number, and that the value exceeds the limit of that number type.
How do I overcome this problem?
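The asker's guess about a numeric limit can be checked against the two lists. One observation, not a confirmed diagnosis: every skipped value is above UInt32.MaxValue (4294967295), while every value that loads is below it. The sketch below tests that boundary and shows one commonly suggested workaround, which is to write a Schema.ini that declares the column as Text so the driver does not guess a numeric type. The helper names and the column name StockCode are hypothetical, and only the first column is declared here; a real file would need a ColN entry for every column.

```vbnet
Imports System
Imports System.IO

Module StockSchemaSketch
    ' True when the value fits in an unsigned 32-bit integer.
    Function FitsInUInt32(ByVal value As Long) As Boolean
        Return value >= 0 AndAlso value <= UInteger.MaxValue
    End Function

    ' Writes a Schema.ini into the CSV's folder so the text driver treats
    ' the first column as Text. "StockCode" is a hypothetical column name.
    Sub WriteStockSchema(ByVal folder As String)
        Dim lines As String() = {
            "[STOCK.csv]",
            "ColNameHeader=False",
            "Format=CSVDelimited",
            "Col1=StockCode Text"
        }
        File.WriteAllLines(Path.Combine(folder, "Schema.ini"), lines)
    End Sub
End Module
```

With the column read as Text, the values would arrive as strings and could be converted with Long.Parse where a number is needed.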
I have an app where a user uploads a spreadsheet and specifies the sheet name and the row number of the header row. I need the app to extract the column names from that specified row. I was able to get it to work returning the top row. How would I specify that the column names I want are on row(x)?
Dim ExcelConn As System.Data.OleDb.OleDbConnection
Dim ExcelTable As DataTable = Nothing
Dim dr As DataRow
Dim sheet_found As Boolean = False
ExcelConn = New system.Data.OleDb.OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & file & ";Extended Properties=Excel 12.0;")
'open the file
ExcelConn.Open()
ExcelTable = ExcelConn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, New Object() {Nothing, Nothing, Nothing, "Table"})
'make sure there is a matching sheet name
For Each dr In ExcelTable.Rows
If dr("TABLE_NAME").ToString() = sheet & "$" Then
sheet_found = True
Exit For
End If
Next
If sheet_found = False Then
MsgBox("the sheet name specified in the header (" + sheet + ") was not found")
ExcelConn.Close()
Exit Sub
Else
Dim sheet_name As String = Nothing
sheet_name = "[" & sheet & "$]"
Dim cmd1 As New System.Data.OleDb.OleDbCommand("Select * From " & sheet_name, ExcelConn)
Dim da As New OleDbDataAdapter("Select * From " & sheet_name, ExcelConn)
Dim ds As DataSet = New DataSet()
Dim dc As DataColumn
da.Fill(ds)
For Each dc In ds.Tables(0).Columns 'this returns col names fine from first row. how would i tell it to get names from 2nd or 3rd row, etc. The integer var is passed in. i just need to know how to specify that it is row(x)
header_row = LCase(RTrim(header_row + "|" + dc.ColumnName))
Next
MsgBox(header_row)
ExcelConn.Close()
End If
As far as I know (I looked into this issue in the past), there is no way to select a table from an Excel file with System.Data.OleDb using a SQL query if the headers are not in row 1. My solution is to delete all the rows above the header row before querying the worksheet: open the workbook with Microsoft.Office.Interop, delete the extra rows, close it, and then query it.
Excel is a very powerful tool, but it was never designed to behave like a database (SQL Server or an Access file, for example).
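One alternative that avoids editing the workbook, offered here as a sketch rather than a tested solution: fill the DataTable with HDR=No in the connection string so that every row, including the header row, arrives as ordinary data, and then read the names out of the chosen row. The helper GetHeaderNames and its parameter names are hypothetical. Columns that mix text and numbers may also need IMEX=1, otherwise header text in a mostly numeric column can come back empty.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module HeaderRowSketch
    ' Returns the cell values of the given 1-based row as column names.
    ' Assumes the table was filled with HDR=No, so the header row is plain data.
    Function GetHeaderNames(ByVal table As DataTable, ByVal headerRowNumber As Integer) As List(Of String)
        If headerRowNumber < 1 OrElse headerRowNumber > table.Rows.Count Then
            Throw New ArgumentOutOfRangeException("headerRowNumber")
        End If

        Dim names As New List(Of String)
        Dim headerRow As DataRow = table.Rows(headerRowNumber - 1)
        For Each column As DataColumn In table.Columns
            ' Empty cells arrive as DBNull; Convert.ToString turns those into "".
            names.Add(Convert.ToString(headerRow(column)).Trim())
        Next
        Return names
    End Function
End Module
```

Called as GetHeaderNames(ds.Tables(0), 3), this would return the contents of the third row, which could then be joined with "|" as in the question's loop.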
There are some known limitations to use the JET/ACE drivers to access data in Excel sheets, as jonathana has pointed out.
As an alternative, I'd like to offer our Excel ADO.NET Provider. With it, you get all of the SQL access to your Excel data that you're accustomed to from the JET/ACE drivers, but with more flexibility in how that data is arranged in Excel.
In your example, you could submit a query like the following to denote that the headers are placed in row 4:
SELECT * FROM Sheet1#A4:**
Using our provider, your code would look similar to the following:
Dim ExcelConn As System.Data.CData.Excel.ExcelConnection
Dim ExcelTable As DataTable = Nothing
Dim dr As DataRow
Dim sheet_found As Boolean = False
ExcelConn = New System.Data.CData.Excel.ExcelConnection("Excel File=" & file & ";")
'open the file
ExcelConn.Open()
ExcelTable = ExcelConn.GetSchema("Tables")
'make sure there is a matching sheet name
For Each dr In ExcelTable.Rows
If dr("Table_Name").ToString() = sheet Then
sheet_found = True
Exit For
End If
Next
If sheet_found = False Then
MsgBox("the sheet name specified in the header (" + sheet + ") was not found")
ExcelConn.Close()
Exit Sub
Else
Dim sheet_name As String = Nothing
'Here, I assume that header_row indicates the row that contains the headers
sheet_name = "[" & sheet & "#A" & header_row & ":**]"
Dim cmd1 As New System.Data.CData.Excel.ExcelCommand("Select * From " & sheet_name, ExcelConn)
Dim da As New System.Data.CData.Excel.ExcelDataAdapter("Select * From " & sheet_name, ExcelConn)
Dim ds As DataSet = New DataSet()
Dim dc As DataColumn
da.Fill(ds)
For Each dc In ds.Tables(0).Columns
'I wasn't sure what this code was meant to accomplish, but at this point,
'dc.ColumnName contains the column names from header_row
Next
ExcelConn.Close()
End If
We have a blog post on our site with more information on our provider and you can download a free trial from our site as well.
I've been given a CSV file that I would like to import into a DataTable. The challenge is that the file has two different delimiters: the first few columns are delimited with a tab and the rest with a ";". I can handle either one easily, but I'm not sure how to handle both. This is the code I have so far; I'm struggling to find a way to expand it to do the import in a single step:
Public Function LoadFileToDatatable(ByVal FullFilePath As String) As DataTable
'Load the Testfile into an datatable
Dim folder As String = System.IO.Path.GetDirectoryName(FullFilePath)
Dim filename As String = System.IO.Path.GetFileName(FullFilePath)
Dim con = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & folder & ";Extended Properties=""text;HDR=No;FMT=Delimited"";"
Dim dt As New DataTable
Using Adp As New OleDbDataAdapter("Select * From " & filename, con)
Adp.Fill(dt)
'Remove the first row as it contains the header data
Dim theRow As DataRow = dt.Rows(0)
dt.Rows.Remove(theRow)
End Using
Return dt
End Function
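The thread gives no answer for the two-delimiter case. A minimal sketch of one way to do it in a single pass, which swaps OleDb for plain string handling: split every line on both characters with String.Split. It assumes no field contains a tab or a semicolon and that fields are not quoted. The function name, the hasHeader flag and the generated column names F1, F2, ... are hypothetical.

```vbnet
Imports System
Imports System.Data
Imports Microsoft.VisualBasic

Module MixedDelimiterSketch
    ' Splits each line on tab and ";" and loads the fields into a DataTable.
    ' Every column is typed as String; the header line is skipped when hasHeader is True.
    Function ParseMixedDelimited(ByVal lines As String(), ByVal hasHeader As Boolean) As DataTable
        Dim separators As Char() = {ControlChars.Tab, ";"c}
        Dim table As New DataTable()
        Dim firstDataLine As Integer = If(hasHeader, 1, 0)

        For i As Integer = firstDataLine To lines.Length - 1
            If lines(i).Length = 0 Then Continue For ' ignore blank lines
            Dim fields As String() = lines(i).Split(separators)

            ' Grow the column list if this line has more fields than seen so far.
            While table.Columns.Count < fields.Length
                table.Columns.Add("F" & (table.Columns.Count + 1).ToString(), GetType(String))
            End While

            Dim row As DataRow = table.NewRow()
            For c As Integer = 0 To fields.Length - 1
                row(c) = fields(c)
            Next
            table.Rows.Add(row)
        Next
        Return table
    End Function
End Module
```

Usage would be along the lines of ParseMixedDelimited(IO.File.ReadAllLines(FullFilePath), True), which also replaces the step that removes the first row after filling.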
(I don't need alternatives to OleDbDataAdapter.)
The code below finds and reads the file OK, and the DGV has four columns (as expected), but every data row has all of its text in the first column.
Dim sDir As String = "c:\temp\"
Dim sConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & sDir & ";Extended Properties='text;HDR=Yes;FMT=TabDelimited';"
Dim dt As New DataTable()
Using adapt As New OleDbDataAdapter(String.Format("SELECT TOP 100 * FROM robo.txt"), sConn)
adapt.Fill(dt)
End Using
DataGridView1.DataSource = dt
I would think the Extended Properties would be the only requirement. I've tried adding a Schema.ini to no avail. I don't think it is even being read, as the column headers never match the schema.
The header row in the most successful pass used commas as the separator. This resulted in four columns with the proper names, but the tab-separated data was all in Col1. If I use tabs in the header row, I get three system-assigned columns, which sort of corresponds to a data row with two commas.
What am I doing wrong?
Here are the first few rows, with the tab character replaced by <tab>. I have since noticed that I have an extra column in the data. The fix to the header row below did not fix the problem: all data is still dumped into the first field.
Using a tab separator in the header, instead of commas, results in all of the header text and the data being dumped into the first field.
col1,state,col3,size,path
<tab> same<tab><tab> 102912<tab>\\APCD04T\Data\Thumbs.db
<tab> same<tab><tab> 22016<tab>\\APCD04T\Data\APCD Topical Info\APCD_Boards&Committees_List.doc
<tab> same<tab><tab> 4.3 m<tab>\\APCD04T\Data\APCD Topical Info\LOSSAN-LAtoSLORailCorridorStrategicPlan.pdf
I learned several things while trying to load a RoboCopy log into a DataTable using OLEDB:
The log file needs to have a .txt or .csv (or ?) extension; .log fails.
Schema.ini seems to be needed for a tab-delimited RoboCopy log, and it is good for column definitions anyway.
The DataGridView takes a long time to display 30MB of data, so I used filters.
I borrowed code from the net to create a Schema.ini as noted below
(SO bug: code will not paste from Visual Studio anymore. Code tool flips to other web page for Java.)
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Try
Cursor = Cursors.WaitCursor
'http://ss64.com/nt/robocopy.html can suppress header and summary
Dim sFile As String = "c:\temp\robo.txt" ' seems to need a .txt or .csv, .log didn't work
CreateRoboLogSchema(sFile) ' recreated on each pass; not needed once things work
Dim sConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & IO.Path.GetDirectoryName(sFile) & ";Extended Properties='text';"
' use Schema.ini for: HDR=Yes;FMT=TabDelimited' and column definitions
Dim dt As New DataTable()
Dim SQL As String = "SELECT * FROM " & IO.Path.GetFileName(sFile)
'SQL &= " WHERE State <> 'Same'"
Using adapt As New OleDbDataAdapter(SQL, sConn)
adapt.Fill(dt)
End Using
Debug.Print("|" & dt.Rows(0)(1) & "|") ' show import trimmed leading spaces (trims trailing too)
' DGV slow to load large files, use filter to display target rows
Dim dv As New DataView(dt)
dv.RowFilter = "State <> 'Same'" ' not case sensitive
DataGridView1.DataSource = dv
DataGridView1.Columns(0).Visible = False
DataGridView1.AutoResizeColumns()
Catch ex As Exception
MsgBox(ex.Message)
Finally
Cursor = Cursors.Default ' restore the cursor that was set to WaitCursor above
End Try
End Sub
Private Function CreateRoboLogSchema(ByVal strFileName As String) As Boolean
' edit http://www.vb-tips.com/CSVDataSet.aspx
Dim ascii As System.Text.Encoding = System.Text.Encoding.ASCII
Dim swSchema As System.IO.StreamWriter = Nothing
Dim blnReturn As Boolean
Dim strSchemaPath As String = System.IO.Path.GetFileName(strFileName)
Try
strSchemaPath = IO.Path.GetDirectoryName(strFileName) & "\Schema.ini"
swSchema = My.Computer.FileSystem.OpenTextFileWriter(strSchemaPath, False, ascii)
Dim strFile As String = System.IO.Path.GetFileName(strFileName)
swSchema.WriteLine("[" & IO.Path.GetFileName(strFileName) & "]")
swSchema.WriteLine("ColNameHeader=False")
swSchema.WriteLine("Format=TabDelimited")
swSchema.WriteLine("Col1=Value1 Text") ' file specific
swSchema.WriteLine("Col2=State Text")
swSchema.WriteLine("Col3=DirChanges Text")
swSchema.WriteLine("Col4=Size Text")
swSchema.WriteLine("Col5=Filepath Text")
'Continue for all fields
blnReturn = True
Catch ex As Exception
blnReturn = False
Finally
If swSchema IsNot Nothing Then
swSchema.Close()
End If
End Try
Return blnReturn
End Function
Alright, I finally got this code to work after hours of toiling:
Dim path As String = OpenFileDialog1.FileName
Dim myDataset As New DataSet()
Dim strConn = New OleDbConnection("Provider=Microsoft.ACE.Oledb.12.0;Data Source=" & path & ";Extended Properties=""Excel 12.0;HDR=YES;IMEX=1""")
Dim myData As New OleDb.OleDbDataAdapter("SELECT * FROM [Sheet1$]", strConn)
myData.Fill(myDataset)
DataGridView1.DataSource = myDataset.Tables(0).DefaultView
Now that I figured that out I was going to try and place the data in a specific location. On my application I have a datagridview set up with 4 columns. What I would like to do is put column A of the excel file under the 1st column of the datagridview and column C of the Excel File in the second column of the datagridview.
So replace:
DataGridView1.DataSource = myDataset.Tables(0).DefaultView
with:
DataGridView1.columns(0) = myDataset.Tables(0).columns(0)
DataGridView1.columns(1) = myDataset.Tables(0).columns(2)
Obviously this doesn't work, and something tells me I might need a For loop to import the data, but I have never imported information from an Excel file before. To make it worse, I have never worked with DataGridViews before, so I have no idea how to go about this.
I would like to do something like this if I could:
For x = 1 To xldoc.rows.length - 1
DataGridView1.Item(0, x).Value = CType(xlDoc.Cells(0, x + 1), Excel.Range).Text
Next
This ended up being a much easier way to import the data. I'm posting it in case anyone else comes across this thread.
If OpenFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
xLApp = New Excel.Application
xLBook = xLApp.Workbooks.Open(OpenFileDialog1.FileName)
xLSheet = xLBook.Worksheets("Sheet1")
For x = 1 To xLSheet.UsedRange.Rows.Count - 1
DataGridView1.Rows.Add()
DataGridView1.Item(0, x - 1).Value = xLSheet.Cells(1 + x, 1).value
DataGridView1.Item(1, x - 1).Value = xLSheet.Cells(1 + x, xLSheet.UsedRange.Columns.Count).value
Next
End If
Don't even bother with:
Dim myDataset As New DataSet()
Dim strConn = New OleDbConnection("Provider=Microsoft.ACE.Oledb.12.0;Data Source=" & path & ";Extended Properties=""Excel 12.0;HDR=YES;IMEX=1""")
Dim myData As New OleDb.OleDbDataAdapter("SELECT * FROM [Sheet1$]", strConn)
Think of it roughly as follows:
Excel Work book = Database
Excel Work sheet = Table
Each Excel column = Table column
Each Excel row = Table row
Excel cell = a particular column value in a particular row
If your Excel has column headers, those are your field names. Now change your SQL query to select the columns you want and bind as usual.
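As a rough sketch of that idea: with HDR=YES in the connection string, the header cells become the field names, so the query can name just the columns that should reach the grid. BuildSelect and LoadColumns are hypothetical helpers, and the header names "Name" and "Price" in the usage note are placeholders for whatever the sheet's first row contains. The connection string is the one used in the question; LoadColumns has not been run against a real workbook here.

```vbnet
Imports System
Imports System.Data
Imports System.Data.OleDb

Module ColumnSelectSketch
    ' Builds a SELECT that names only the wanted columns, bracketing each name
    ' so headers containing spaces still parse.
    Function BuildSelect(ByVal sheetName As String, ParamArray columnNames As String()) As String
        Dim quoted(columnNames.Length - 1) As String
        For i As Integer = 0 To columnNames.Length - 1
            quoted(i) = "[" & columnNames(i) & "]"
        Next
        Return "SELECT " & String.Join(", ", quoted) & " FROM [" & sheetName & "$]"
    End Function

    ' Fills a DataTable with only the selected columns of the sheet.
    Function LoadColumns(ByVal path As String, ByVal sheetName As String, ParamArray columnNames As String()) As DataTable
        Dim connStr As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & path &
            ";Extended Properties=""Excel 12.0;HDR=YES;IMEX=1"""
        Dim table As New DataTable()
        Using adapter As New OleDbDataAdapter(BuildSelect(sheetName, columnNames), connStr)
            adapter.Fill(table)
        End Using
        Return table
    End Function
End Module
```

Binding would then look like DataGridView1.DataSource = LoadColumns(path, "Sheet1", "Name", "Price"). If the sheet has no header row, HDR=NO makes the driver name the columns F1, F2, F3 and so on, so columns A and C would be selected as F1 and F3.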