I need to write 50 million records with 72 columns to a text file; the file grows to about 9.7 GB.
I need to check the length of every column and format each value according to the length defined in an XML file.
I read the records from Oracle one by one, check the format, and write them to the text file.
Writing 5 crore (50 million) records takes more than 24 hours. How can I improve the performance of the code below?
Dim valString As String = Nothing
Dim valName As String = Nothing
Dim valLength As String = Nothing
Dim valDataType As String = Nothing
Dim validationsArray As ArrayList = GetValidations(Directory.GetCurrentDirectory() + "\ReportFormat.xml")
Console.WriteLine("passed xml")
Dim k As Integer = 1
Try
Console.WriteLine(System.DateTime.Now())
Dim selectSql As String = "select * from table where" & _
" record_date >= To_Date('01-01-2014','DD-MM-YYYY') and record_date <= To_Date('31-12-2014','DD-MM-YYYY')"
Dim dataTable As New DataTable
Dim oracleAccess As New OracleConnection(System.Configuration.ConfigurationManager.AppSettings("OracleConnection"))
Dim cmd As New OracleCommand()
cmd.Connection = oracleAccess
cmd.CommandType = CommandType.Text
cmd.CommandText = selectSql
oracleAccess.Open()
Dim Tablecolumns As New DataTable()
Using oracleAccess
Using writer = New StreamWriter(Directory.GetCurrentDirectory() + "\FileName.txt")
Using odr As OracleDataReader = cmd.ExecuteReader()
Dim sbHeaderData As New StringBuilder
For i As Integer = 0 To odr.FieldCount - 1
sbHeaderData.Append(odr.GetName(i))
sbHeaderData.Append("|")
Next
writer.WriteLine(sbHeaderData)
While odr.Read()
Dim sbColumnData As New StringBuilder
Dim values(odr.FieldCount - 1) As Object
Dim fieldCount As Integer = odr.GetValues(values)
For i As Integer = 0 To fieldCount - 1
Dim vals As Array = validationsArray(i).ToString.ToUpper.Split("|")
valName = vals(0).trim
valDataType = vals(1).trim
valLength = vals(2).trim
Select Case valDataType
Case "VARCHAR2"
If values(i).ToString().Length = valLength Then
sbColumnData.Append(values(i).ToString())
'sbColumnData.Append("|")
ElseIf values(i).ToString().Length > valLength Then
sbColumnData.Append(values(i).ToString().Substring(0, valLength))
'sbColumnData.Append("|")
Else
sbColumnData.Append(values(i).ToString().PadRight(valLength))
'sbColumnData.Append("|")
End If
Case "NUMERIC"
valLength = valLength.Substring(0, valLength.IndexOf(","))
If values(i).ToString().Length = valLength Then
sbColumnData.Append(values(i).ToString())
'sbColumnData.Append("|")
Else
sbColumnData.Append(values(i).ToString().PadLeft(valLength, "0"c))
'sbColumnData.Append("|")
End If
'sbColumnData.Append((values(i).ToString()))
End Select
Next
writer.WriteLine(sbColumnData)
k = k + 1
Console.WriteLine(k)
End While
End Using
writer.WriteLine(System.DateTime.Now())
End Using
End Using
Console.WriteLine(System.DateTime.Now())
'Dim Adpt As New OracleDataAdapter(selectSql, oracleAccess)
'Adpt.Fill(dataTable)
Return Tablecolumns
Catch ex As Exception
Console.WriteLine(System.DateTime.Now())
Console.WriteLine("Error: " & ex.Message)
Console.ReadLine()
Return Nothing
End Try
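One thing that may help is parsing each validation entry once, before the row loop, instead of splitting and upper-casing the same 72 strings for every record. The following is a minimal sketch, not a measured fix: it assumes each entry looks like "NAME|TYPE|LENGTH" with NUMERIC lengths such as "10,2", as the code above implies. ColumnSpec, ParseSpec and FormatField are hypothetical names, and any type other than NUMERIC is treated like VARCHAR2 here, which differs from the original Select Case.

```vbnet
Imports System
Imports System.Text

Module FixedWidthDemo
    ' Hypothetical pre-parsed form of one validation entry.
    Structure ColumnSpec
        Public IsNumeric As Boolean
        Public Width As Integer
    End Structure

    ' Parses one "NAME|TYPE|LENGTH" entry; for NUMERIC only the part before the comma is used.
    Function ParseSpec(entry As String) As ColumnSpec
        Dim parts As String() = entry.ToUpper().Split("|"c)
        Dim spec As ColumnSpec
        spec.IsNumeric = (parts(1).Trim() = "NUMERIC")
        spec.Width = Integer.Parse(parts(2).Trim().Split(","c)(0))
        Return spec
    End Function

    ' Numeric values are left-padded with zeros; everything else is truncated or right-padded.
    Function FormatField(value As String, spec As ColumnSpec) As String
        If spec.IsNumeric Then Return value.PadLeft(spec.Width, "0"c)
        If value.Length > spec.Width Then Return value.Substring(0, spec.Width)
        Return value.PadRight(spec.Width)
    End Function

    Sub Main()
        ' Parse the specs once, outside the per-row loop.
        Dim specs As ColumnSpec() = {ParseSpec("name|varchar2|5"), ParseSpec("amount|numeric|6,2")}
        Dim row As String() = {"abcdefg", "42"}
        Dim sb As New StringBuilder()
        For i As Integer = 0 To row.Length - 1
            sb.Append(FormatField(row(i), specs(i)))
        Next
        Console.WriteLine(sb.ToString()) ' prints abcde000042
    End Sub
End Module
```

Calling Console.WriteLine(k) for every record is also likely to cost noticeable time over 50 million rows; printing the counter only every so many rows would be a cheap thing to try.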
Public connstring As String = "Provider = Microsoft.ACE.Oledb.12.0; Data Source = C:\Users\blablabla\Document\Visual Studio 2013\Project\rentalsystem\rental_db.accdb"
I placed this line of code in a module so I won't have to declare it every time. This is the code that uses it:
Public Sub loadLVusers()
LVusers.Items.Clear()
Dim sqlcode As String = "select id_code, user_lastname, user_firstname, user_midname, user_username, user_password, user_email, user_privilege from tblUsers"
Dim sqlcomd As New OleDb.OleDbCommand(sqlcode)
sqlcomd.Connection = New OleDb.OleDbConnection(connstring)
sqlcomd.Connection.Open()
Dim DA As New OleDb.OleDbDataAdapter(sqlcomd)
Dim DS As New DataSet
DA.Fill(DS, "Pi")
If DS.Tables("Pi").Rows.Count > 0 Then
Dim Ic(100) As String
For r = 0 To DS.Tables("Pi").Rows.Count - 1
For c = 0 To DS.Tables("Pi").Columns.Count - 1
Ic(c) = DS.Tables("Pi").Rows(r)(c).ToString
Next
Dim LVI As New ListViewItem(Ic)
LVusers.Items.Add(LVI)
Next
End If
End Sub
Now, when the form/window that it is attached to loads, the form/window does not open, and the debugger highlights this line:
sqlcomd.Connection = New OleDb.OleDbConnection(connstring)
So I'm guessing it has something to do with the file path format.
So, quite simple.
I am importing CSV files into a DataGrid, but the number of columns varies from file to file.
For 3 Columns, I use this code:
Dim sr As New IO.StreamReader("E:\test.txt")
Dim dt As New DataTable
Dim newline() As String = sr.ReadLine.Split(";"c)
dt.Columns.AddRange({New DataColumn(newline(0)), _
New DataColumn(newline(1)), _
New DataColumn(newline(2))})
While (Not sr.EndOfStream)
newline = sr.ReadLine.Split(";"c)
Dim newrow As DataRow = dt.NewRow
newrow.ItemArray = {newline(0), newline(1), newline(2)}
dt.Rows.Add(newrow)
End While
DG1.DataSource = dt
This works perfectly. But how do I count the number of elements in newline?
Can I get a count of the fields somehow? The other example code I found doesn't create column headers.
If my CSV file has 5 columns, I would need an AddRange of 5 instead of 3, and so on.
Thanks in advance
Dim sr As New IO.StreamReader(path)
Dim dt As New DataTable
Dim newline() As String = sr.ReadLine.Split(","c)
' MsgBox(newline.Count)
' dt.Columns.AddRange({New DataColumn(newline(0)),
' New DataColumn(newline(1)),
' New DataColumn(newline(2))})
Dim i As Integer
For i = 0 To newline.Count - 1
dt.Columns.AddRange({New DataColumn(newline(i))})
Next
While (Not sr.EndOfStream)
newline = sr.ReadLine.Split(","c)
Dim newrow As DataRow = dt.NewRow
newrow.ItemArray = newline 'assign every field, however many columns the file has
dt.Rows.Add(newrow)
End While
dgv.DataSource = dt
End Sub
Columns and item values can be added to a DataTable individually, using dt.Columns.Add and newrow.Item, so that these can be done in a loop instead of hard-coding for a specific number of columns. e.g. (this code assumes Option Infer On, so adjust as needed):
'Requires Imports System.IO (for File.ReadLines)
Public Function CsvToDataTable(csvName As String, Optional delimiter As Char = ","c) As DataTable
Dim dt = New DataTable()
For Each line In File.ReadLines(csvName)
If dt.Columns.Count = 0 Then
For Each part In line.Split({delimiter})
dt.Columns.Add(New DataColumn(part))
Next
Else
Dim row = dt.NewRow()
Dim parts = line.Split({delimiter})
For i = 0 To parts.Length - 1
row(i) = parts(i)
Next
dt.Rows.Add(row)
End If
Next
Return dt
End Function
You could then use it like:
Dim dt = CsvToDataTable("E:\test.txt", ";"c)
DG1.DataSource = dt
I have a DataGridView with two image columns. I want to export the data to Excel, and I am using this code:
SaveFileDialog1.Filter = "Excel Files (*.xlsx*)|*.xlsx"
SaveFileDialog1.ShowDialog()
Dim filename As String = SaveFileDialog1.FileName
'verifying whether the DataGridView has data or not
If ((DataGridView1.Columns.Count = 0) Or (DataGridView1.Rows.Count = 0)) Then
Exit Sub
End If
'Creating dataset to export
Dim dset As New DataSet
'add table to dataset
dset.Tables.Add()
'add column to that table
For i As Integer = 0 To DataGridView1.ColumnCount - 1
dset.Tables(0).Columns.Add(DataGridView1.Columns(i).HeaderText)
Next
'add rows to the table
Dim dr1 As DataRow
For i As Integer = 0 To DataGridView1.RowCount - 1
dr1 = dset.Tables(0).NewRow
For j As Integer = 0 To DataGridView1.Columns.Count - 1
Dim cj = DataGridView1.Rows(i).Cells(j).Value
If (cj.GetType = GetType(Byte())) Then
'Error: Public member 'Value' on type 'Integer' not found.
Dim data As Byte() = DirectCast(cj.Value, Byte())
Dim ms As New System.IO.MemoryStream(data)
Dim k As System.Drawing.Image = System.Drawing.Image.FromStream(ms)
dr1(j) = k
Else
dr1(j) = DataGridView1.Rows(i).Cells(j).Value
End If
Next
dset.Tables(0).Rows.Add(dr1)
Next
Dim excel As New Excel.Application()
Dim wBook As Microsoft.Office.Interop.Excel.Workbook
Dim wSheet As Microsoft.Office.Interop.Excel.Worksheet
wBook = excel.Workbooks.Add()
wSheet = wBook.ActiveSheet()
Dim dt As System.Data.DataTable = dset.Tables(0)
Dim dc As System.Data.DataColumn
Dim dr As System.Data.DataRow
Dim colIndex As Integer = 0
Dim rowIndex As Integer = 0
For Each dc In dt.Columns
colIndex = colIndex + 1
excel.Cells(1, colIndex) = dc.ColumnName
Next
For Each dr In dt.Rows
rowIndex = rowIndex + 1
colIndex = 0
For Each dc In dt.Columns
colIndex = colIndex + 1
excel.Cells(rowIndex + 1, colIndex) = dr(dc.ColumnName)
Next
Next
wSheet.Columns.AutoFit()
Dim strFileName As String = filename
Dim blnFileOpen As Boolean = False
Try
Dim fileTemp As System.IO.FileStream = System.IO.File.OpenWrite(strFileName)
fileTemp.Close()
Catch ex As Exception
blnFileOpen = False
End Try
If System.IO.File.Exists(strFileName) Then
System.IO.File.Delete(strFileName)
End If
wBook.SaveAs(strFileName)
excel.Workbooks.Open(strFileName)
excel.Visible = True
It gives me the error "Public member 'Value' on type 'Integer' not found", even though the same condition works with iTextSharp when I create a PDF. Please help me. If I remove that condition and run the code, it creates an Excel file in which the image column contains 'System.Byte[]'.
I get the feeling that you think that you're storing the grid cell in cj and then getting the Value of that cell but you have actually already got the Value of the cell and put that into cj. Get rid of the Value property access on the following line and your code will likely work:
Dim cj = DataGridView1.Rows(i).Cells(j).Value
By the way, don't do this:
If (cj.Value.GetType = GetType(Byte())) Then
It would be preferable to do this in isolation:
If TypeOf cj.Value Is Byte() Then
but, given that you're going to cast as that type if it is then you should use TryCast instead:
Dim data As Byte() = TryCast(cj.Value, Byte())
If data IsNot Nothing Then
'Use data here.
End If
TryCast is like DirectCast except that it returns Nothing if the cast fails rather than throwing an exception.
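As a small self-contained illustration of that behaviour (a sketch with made-up values, not the grid code from the question; ByteLength is a hypothetical helper), TryCast yields the array when the object really is a Byte() and Nothing for anything else:

```vbnet
Imports System

Module TryCastDemo
    ' Returns the number of bytes if value is a Byte(), otherwise -1.
    Function ByteLength(value As Object) As Integer
        Dim data As Byte() = TryCast(value, Byte())
        If data IsNot Nothing Then Return data.Length
        Return -1
    End Function

    Sub Main()
        Console.WriteLine(ByteLength(New Byte() {1, 2, 3})) ' prints 3
        Console.WriteLine(ByteLength(42))                   ' prints -1
        Console.WriteLine(ByteLength(DBNull.Value))         ' prints -1
    End Sub
End Module
```

The same check also covers empty cells, since TryCast returns Nothing for both Nothing and DBNull.Value.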
I have an interesting conundrum: how do I quickly (in under 1 minute) export a large DataTable (filled from SQL, 35,000 rows) into an Excel spreadsheet for users? I have code in place that handles the export, and while nothing is "wrong" with the code per se, it is infuriatingly slow, taking 4 minutes to export the entire file (sometimes longer if a user has less RAM or is running more on their system). Sadly, this is an improvement over the 10+ minutes it used to take with our old method. Simply put, can this be made any faster without using third-party components? If so, how? My code is below; the slowdown occurs between message boxes 6 and 7, where each row is written. Thank you all for taking the time to look at this:
Private Sub btnTest_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnJeffTest.Click
Test(MySPtoExport)
End Sub
Private Sub Test(ByVal SQL As String)
'Declare variables used to execute the VUE Export stored procedure
MsgBox("start stop watch")
Dim ConnectionString As New SqlConnection(CType(ConfigurationManager.AppSettings("ConnString"), String))
Dim cmdSP As New SqlClient.SqlCommand
Dim MyParam As New SqlClient.SqlParameter
Dim MyDataAdapter As New SqlClient.SqlDataAdapter
Dim ExportDataSet As New DataTable
Dim FilePath As String
MsgBox("stop 1 - end of declare")
Try
' open the connection
ConnectionString.Open()
' Use the connection for this sql command
cmdSP.Connection = ConnectionString
'set this command as a stored procedure command
cmdSP.CommandType = CommandType.StoredProcedure
'get the stored procedure name and plug it in
cmdSP.CommandText = SQL
'Add the Start Date parameter if required
Select Case StDt
Case Nothing
' there's no parameter to add
Case Is = 0
' there's no parameter to add
Case Else
'add the parameter name, its direction and its value
MyParam = cmdSP.Parameters.Add("#StartDate", SqlDbType.VarChar)
MyParam.Direction = ParameterDirection.Input
MyParam.Value = Me.txtStartDate.Text
End Select
MsgBox("stop 2 - sql ready")
'Add the End Date parameter if required
Select Case EdDt
Case Nothing
' there's no parameter to add
Case Is = 0
' there's no parameter to add
Case Else
'add the parameter name, its direction and its value
MyParam = cmdSP.Parameters.Add("#EndDate", SqlDbType.VarChar)
MyParam.Direction = ParameterDirection.Input
MyParam.Value = Me.txtEndDate.Text
End Select
'Add the single parameter 1 parameter if required
Select Case SPar1
Case Is = Nothing
' there's no parameter to add
Case Is = ""
' there's no parameter to add
Case Else
'add the parameter name, its direction and its value
MyParam = cmdSP.Parameters.Add(SPar1, SqlDbType.VarChar)
MyParam.Direction = ParameterDirection.Input
MyParam.Value = Me.txtSingleReportCrt1.Text
End Select
'Add the single parameter 2 parameter if required
Select Case Spar2
Case Is = Nothing
' there's no parameter to add
Case Is = ""
' there's no parameter to add
Case Else
'add the parameter name, its direction and its value
MyParam = cmdSP.Parameters.Add(Spar2, SqlDbType.VarChar)
MyParam.Direction = ParameterDirection.Input
MyParam.Value = Me.txtSingleReportCrt2.Text
End Select
MsgBox("stop 3 - params ready")
'Prepare the data adapter with the selected command
MyDataAdapter.SelectCommand = cmdSP
' Set the accept changes during fill to false for the NYPDA export
MyDataAdapter.AcceptChangesDuringFill = False
'Fill the Dataset tables (Table 0 = Exam Eligibilities, Table 1 = Candidates Demographics)
MyDataAdapter.Fill(ExportDataSet)
'Close the connection
ConnectionString.Close()
'refresh the destination path in case they changed it
SPDestination = txtPDFDestination.Text
MsgBox("stop 4 - procedure ran, datatable filled")
Select Case ExcelFile
Case True
FilePath = SPDestination & lblReportName.Text & ".xls"
Dim _excel As New Microsoft.Office.Interop.Excel.Application
Dim wBook As Microsoft.Office.Interop.Excel.Workbook
Dim wSheet As Microsoft.Office.Interop.Excel.Worksheet
wBook = _excel.Workbooks.Add()
wSheet = wBook.ActiveSheet()
Dim dt As System.Data.DataTable = ExportDataSet
Dim dc As System.Data.DataColumn
Dim dr As System.Data.DataRow
Dim colIndex As Integer = 0
Dim rowIndex As Integer = 0
MsgBox("stop 5 - excel stuff declared")
For Each dc In dt.Columns
colIndex = colIndex + 1
_excel.Cells(1, colIndex) = dc.ColumnName
Next
MsgBox("stop 6 - Header written")
For Each dr In dt.Rows
rowIndex = rowIndex + 1
colIndex = 0
For Each dc In dt.Columns
colIndex = colIndex + 1
_excel.Cells(rowIndex + 1, colIndex) = dr(dc.ColumnName)
Next
Next
MsgBox("stop 7 - rows written")
wSheet.Columns.AutoFit()
MsgBox("stop 8 - autofit complete")
Dim strFileName = SPDestination & lblReportName.Text & ".xls"
If System.IO.File.Exists(strFileName) Then
System.IO.File.Delete(strFileName)
End If
MsgBox("stop 9 - file checked")
wBook.SaveAs(strFileName)
wBook.Close()
_excel.Quit()
End Select
MsgBox("File " & lblReportName.Text & " Exported Successfully!")
'Dispose of unneeded objects
MyDataAdapter.Dispose()
ExportDataSet.Dispose()
StDt = Nothing
EdDt = Nothing
SPar1 = Nothing
Spar2 = Nothing
MyParam = Nothing
cmdSP.Dispose()
cmdSP = Nothing
MyDataAdapter = Nothing
ExportDataSet = Nothing
Catch ex As Exception
' Something went terribly wrong. Warn user.
MessageBox.Show("Error: " & ex.Message, "Stored Procedure Running Process ", _
MessageBoxButtons.OK, MessageBoxIcon.Error)
Finally
'close the connection in case is still open
If Not ConnectionString.State = ConnectionState.Closed Then
ConnectionString.Close()
ConnectionString = Nothing
End If
' reset the fields
ResetFields()
End Try
End Sub
Even though the question was asked several years ago, I thought I would add my solution since the question was posed in VB and the "best answer" is in C#. This solution writes 22,000+ rows (1.9MB) in 4 seconds on an i7 System w/ 16GB RAM.
Imports Excel = Microsoft.Office.Interop.Excel
Public Class Main
Private Sub btnExportToExcel(sender As Object, e As EventArgs) Handles btnExpToExcel.Click
'Needed for the Excel Workbook/WorkSheet(s)
Dim app As New Excel.Application
Dim wb As Excel.Workbook = app.Workbooks.Add()
Dim ws As Excel.Worksheet
Dim strFN as String = "MyFileName.xlsx" 'must have ".xlsx" extension
'Standard code for filling a DataTable from SQL Server
Dim strSQL As String = "My SQL Statement for the DataTable"
Dim conn As New SqlConnection With {.ConnectionString = "My Connection"}
Dim MyTable As New DataTable
Dim cmd As New SqlCommand(strSQL, conn)
Dim da As New SqlDataAdapter(cmd)
da.Fill(MyTable)
'Add a sheet to the workbook and fill it with data from MyTable
'You could create multiple tables and add additional sheets in a loop
ws = wb.Sheets.Add(After:=wb.Sheets(wb.Sheets.Count))
DataTableToExcel(MyTable, ws, strSym)
wb.SaveAs(strFN) 'save and close the WorkBook
wb.Close()
MsgBox("Export complete.")
End Sub
Private Sub DataTableToExcel(dt As DataTable, ws As Excel.Worksheet, TabName As String)
'VB array bounds are upper bounds, so subtract 1 to size the array exactly
Dim arr(dt.Rows.Count - 1, dt.Columns.Count - 1) As Object
Dim r As Int32, c As Int32
'copy the datatable to an array
For r = 0 To dt.Rows.Count - 1
For c = 0 To dt.Columns.Count - 1
arr(r, c) = dt.Rows(r).Item(c)
Next
Next
ws.Name = TabName 'name the worksheet
'add the column headers starting in A1
c = 0
For Each column As DataColumn In dt.Columns
ws.Cells(1, c + 1) = column.ColumnName
c += 1
Next
'add the data starting in cell A2
ws.Range(ws.Cells(2, 1), ws.Cells(dt.Rows.Count, dt.Columns.Count)).Value = arr
End Sub
End Class
Hope it helps.
As when using VBA to automate Excel, you can assign an array directly to the value of a Range object: this is done as a single operation, so you remove the overhead associated with making multiple calls across the process boundaries between your .Net code and the Excel instance.
Eg, see the accepted answer here: Write Array to Excel Range
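A minimal sketch of that idea, assuming a DataTable as the source: the header row and all data rows are copied into one 2-D Object array, which is the shape a Range's Value accepts. ToGrid is a hypothetical helper name, and the Excel call itself is left as a comment so the sample runs without Excel installed.

```vbnet
Imports System
Imports System.Data

Module RangeArrayDemo
    ' Copies the column names (row 0) and every data row into a single 2-D array.
    Function ToGrid(dt As DataTable) As Object(,)
        Dim grid(dt.Rows.Count, dt.Columns.Count - 1) As Object
        For c As Integer = 0 To dt.Columns.Count - 1
            grid(0, c) = dt.Columns(c).ColumnName
        Next
        For r As Integer = 0 To dt.Rows.Count - 1
            For c As Integer = 0 To dt.Columns.Count - 1
                grid(r + 1, c) = dt.Rows(r)(c)
            Next
        Next
        Return grid
    End Function

    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("Id", GetType(Integer))
        dt.Columns.Add("Name", GetType(String))
        dt.Rows.Add(1, "Ann")
        dt.Rows.Add(2, "Bob")

        Dim grid As Object(,) = ToGrid(dt)
        Console.WriteLine(grid.GetLength(0) & " x " & grid.GetLength(1)) ' prints 3 x 2

        ' With Excel interop and a worksheet ws, the whole grid could be written in one call:
        ' ws.Range(ws.Cells(1, 1), ws.Cells(grid.GetLength(0), grid.GetLength(1))).Value = grid
    End Sub
End Module
```

Sizing the target range from the array's own dimensions keeps the range and the data in step, which avoids off-by-one mistakes in the last row.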
The answer from CPRouse worked for me, except that it left off the last row of data. The data starts in row 2, so the last data row is row dt.Rows.Count + 1. In the DataTableToExcel sub, I added 1 to dt.Rows.Count on this line, and it wrote all the records:
ws.Range(ws.Cells(2, 1), ws.Cells(dt.Rows.Count + 1, dt.Columns.Count)).Value = arr
Here is a piece of my own code that performs a very fast export of data from a DataTable to an Excel sheet (use the "Stopwatch" object to compare the speed, and leave me a comment):
Dim _excel As New Excel.Application
Dim wBook As Excel.Workbook
Dim wSheet As Excel.Worksheet
wBook = _excel.Workbooks.Add()
wSheet = wBook.ActiveSheet()
Dim dc As System.Data.DataColumn
Dim colIndex As Integer = 0
Dim rowIndex As Integer = 0
'Number of measurements
Dim Nbligne As Integer = DtMesures.Rows.Count
'Write the column headers and the measurements
For Each dc In DtMesures.Columns
colIndex = colIndex + 1
'Column headers
wSheet.Cells(1, colIndex) = dc.ColumnName
'Data
'You can use CDbl instead of CObj if your data is of type Double
wSheet.Cells(2, colIndex).Resize(Nbligne, ).Value = _excel.Application.transpose(DtMesures.Rows.OfType(Of DataRow)().[Select](Function(k) CObj(k(dc.ColumnName))).ToArray())
Next
We had a VB.NET app that did exactly this, and it took even longer for our users on slow PCs, sometimes 15 minutes.
The app is now an ASP/VB.NET app which simply builds an HTML table and outputs the result with an .xls extension. Excel is able to read the HTML table and parse it into a grid format. You can still pass in XML for formatting and options, horizontal pane locking, etc.
If you don't have the option of using ASP.NET, try looking into a way to build an HTML table string and have Excel parse and populate it for you. It is much faster! I'm sure Excel can parse other formats as well (XML, arrays, HTML, etc.), and all of them would be quicker than manually building each row through VB.NET objects.
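For a desktop app, the HTML table string could be built along these lines. This is only a sketch under assumptions: ToHtmlTable is a hypothetical helper, the source is assumed to be a DataTable, and no formatting XML is included.

```vbnet
Imports System
Imports System.Data
Imports System.Net
Imports System.Text

Module HtmlTableDemo
    ' Builds a plain HTML table: one header row, then one row per DataRow.
    Function ToHtmlTable(dt As DataTable) As String
        Dim sb As New StringBuilder()
        sb.Append("<table><tr>")
        For Each col As DataColumn In dt.Columns
            sb.Append("<th>").Append(WebUtility.HtmlEncode(col.ColumnName)).Append("</th>")
        Next
        sb.Append("</tr>")
        For Each row As DataRow In dt.Rows
            sb.Append("<tr>")
            For Each col As DataColumn In dt.Columns
                ' HtmlEncode keeps characters such as < and & from breaking the markup
                sb.Append("<td>").Append(WebUtility.HtmlEncode(row(col).ToString())).Append("</td>")
            Next
            sb.Append("</tr>")
        Next
        sb.Append("</table>")
        Return sb.ToString()
    End Function

    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("Name", GetType(String))
        dt.Rows.Add("x&y")
        Console.WriteLine(ToHtmlTable(dt))
        ' prints <table><tr><th>Name</th></tr><tr><td>x&amp;y</td></tr></table>
    End Sub
End Module
```

The resulting string could then be saved with System.IO.File.WriteAllText under an .xls file name (the name is up to you). Depending on the Excel version, opening such a file may show a warning that the content does not match the extension.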