I have a customer who wants to import his sub-customers' price tools (more than 2,000,000 records) into a SQL Server database every day (and yeah... more than 900,000 rows change every day).
The data is provided in CSV format (not in the RFC 4180 standard ç_ç, but never mind), and each row can be an insert, a delete or an update.
My problem is that loading the data into the database takes more than one night, and I need to speed it up.
What I'm doing at the moment is:
Load the CSV file into a DataTable (Tab1) (~3 minutes).
Select all data from the existing table (Tab0) and match it against Tab1 (~15 minutes). Unchanged rows are flagged as unmodified with dataRowToProcess.AcceptChanges(), so adapter.Update ignores them; I checked this for the first rows and it seems to work.
Run the following code to apply the changes (more than 5 hours for 900,000 changes):
cmdSQL = New SqlCommand(superQuery, cn)
Dim adapter As SqlDataAdapter = New SqlDataAdapter(cmdSQL)
adapter.MissingSchemaAction = MissingSchemaAction.AddWithKey
Dim build As SqlCommandBuilder = New SqlCommandBuilder(adapter)
build.SetAllValues = False
adapter.Update(dataTableCustomersDetail) 'Insert/Update records
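The row-state flagging described in the steps above can be checked without a database. This is only a minimal in-memory sketch: the "Prices" table, its columns and the CountPendingRows helper are made up for illustration and are not part of the original code. It shows that rows accepted with AcceptChanges() drop out of the set of pending changes, which is the set adapter.Update works through.

```vbnet
Imports System
Imports System.Data

Module RowStateSketch
    ' Hypothetical helper: counts the rows an adapter would still have to send.
    Function CountPendingRows(table As DataTable) As Integer
        ' GetChanges returns Nothing when no row is Added, Modified or Deleted
        Dim changes As DataTable = table.GetChanges()
        Return If(changes Is Nothing, 0, changes.Rows.Count)
    End Function

    Sub Main()
        Dim table As New DataTable("Prices")
        table.Columns.Add("Id", GetType(Integer))
        table.Columns.Add("Price", GetType(Decimal))
        table.Rows.Add(1, 9.99D)
        table.Rows.Add(2, 19.5D)
        table.Rows.Add(3, 4D)

        ' Freshly added rows are all pending (RowState = Added)
        Console.WriteLine(CountPendingRows(table))

        ' AcceptChanges marks every row as Unchanged
        table.AcceptChanges()
        Console.WriteLine(CountPendingRows(table))

        ' Editing one row makes only that row pending again (RowState = Modified)
        table.Rows(1)("Price") = 21D
        Console.WriteLine(table.Rows(1).RowState)
        Console.WriteLine(CountPendingRows(table))
    End Sub
End Module
```

If the count printed before adapter.Update is close to the full table size rather than the expected number of changes, the matching step is probably not flagging rows the way it is meant to.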
If I have many inserts, the process is slower than with the same number of updates.
What am I doing wrong? Am I missing some SqlDataAdapter option?
Thanks
Thanks to @matsnow I figured out a solution with SqlBulkCopy. Considering that half of the table changes every time and that it is a static master-data table, I decided that a delete/insert of the data is the fastest way to go (now it takes 5-6 minutes instead of 10).
Code:
'Delete all table content
Dim cmd As SqlCommand = New SqlCommand("TRUNCATE TABLE " + tableName, cn)
cmd.ExecuteNonQuery()
'Insert all records
Using sbc As SqlBulkCopy = New SqlBulkCopy(cn)
sbc.DestinationTableName = tableName
sbc.BulkCopyTimeout = 1000
For Each column As DataColumn In dataTableCustomersDetail.Columns
sbc.ColumnMappings.Add(column.ToString(), column.ToString())
Next
sbc.WriteToServer(dataTableCustomersDetail)
End Using
Use Connection.BeginTransaction() to speed up the DataAdapter update.
cn.Open() 'open connection
Dim myTrans As SqlTransaction = cn.BeginTransaction()
'Associate the transaction with the select command object of the DataAdapter
adapter.SelectCommand.Transaction = myTrans
Try
    adapter.Update(dataTableCustomersDetail) 'do the update as before
    myTrans.Commit()
Catch ex As Exception
    'A failure in either Update or Commit undoes the whole batch
    myTrans.Rollback()
    Throw
Finally
    cn.Close()
End Try
With 8000 rows this changes the update time from over 5 minutes to 2 seconds
I have a Public Sub to move a collection of records from one table to another in the same SQLite database. First it reads a record from strFromTable, then writes it to strToTable, then deletes the record from strFromTable. To speed things up, I've loaded the entire collection of records into a transaction. When the list involves moving a lot of image blobs, the db gets backed up and throws the exception "The Database is Locked". I think what is happening is that it has not finished writing one record before it starts trying to write the next one. Since SQLite only allows one write at a time, it throws the "Locked" exception.
Here is the code that triggers the error when moving a lot of image blobs:
Using SQLconnect = New SQLiteConnection(strDbConnectionString)
SQLconnect.Open()
Using tr = SQLconnect.BeginTransaction()
Using SQLcommand = SQLconnect.CreateCommand
For Each itm As ListViewItem In lvcollection
SQLcommand.CommandText = $"INSERT INTO {strToTable} SELECT * FROM {strFromTable} WHERE id = {itm.Tag}; DELETE FROM {strFromTable} WHERE ID = {itm.Tag};"
SQLcommand.ExecuteNonQuery()
Next
End Using
tr.Commit()
End Using
End Using
When I get rid of the transaction, it executes without error:
Using SQLconnect = New SQLiteConnection(strDbConnectionString)
SQLconnect.Open()
Using SQLcommand = SQLconnect.CreateCommand
For Each itm As ListViewItem In lvcollection
SQLcommand.CommandText = $"INSERT INTO {strToTable} SELECT * FROM {strFromTable} WHERE id = {itm.Tag}; DELETE FROM {strFromTable} WHERE ID = {itm.Tag};"
SQLcommand.ExecuteNonQuery()
Next
End Using
End Using
I'm not very good with DB operations, so I'm sure there is something that needs improvement. Is there a way to make SQLite completely finish the previous INSERT before executing the next INSERT? How can I change my code to allow using a transaction?
Thank you for your help.
Ok ... here is the solution that I decided to go with. I hope this helps someone finding this in a search:
Dim arrIds(lvcollection.Count - 1) As String
Dim i as Integer = 0
' Load the array with all the Tags in the listViewCollection
For i = 0 to lvcollection.Count - 1
arrIds(i) = lvcollection(i).Tag.ToString() 'item.Tag holds the Primary Key "id" field in the DB
Next
'build a comma-space separated string of all ids from the array of ids.
Dim strIds as String = String.Join(", ", arrIds)
Using SQLconnect = New SQLiteConnection(strDbConnectionString)
SQLconnect.Open()
Using tr = SQLconnect.BeginTransaction()
Using SQLcommand = SQLconnect.CreateCommand
SQLcommand.CommandText = $"INSERT INTO {strToTable} SELECT * FROM {strFromTable} WHERE id IN ({strIds});"
SQLcommand.ExecuteNonQuery()
SQLcommand.CommandText = $"DELETE FROM {strFromTable} WHERE ID IN ({strIds});"
SQLcommand.ExecuteNonQuery()
End Using
tr.Commit()
End Using
End Using
The IN clause lets me pass all of the "id" values as a batch, first to copy the rows and then to delete them. This solution is faster and safer than moving them one by one with no transaction.
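Because the ids are spliced straight into the SQL text, a variation worth considering is to pass them as parameters instead. The following is only a sketch of that idea, not the code used above: the BuildPlaceholders helper, the "@id" naming scheme and the SourceTable name are all assumptions made for illustration. Table names cannot be sent as parameters, so strFromTable and strToTable would still have to be interpolated.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module InClauseSketch
    ' Hypothetical helper: builds "@id0, @id1, ..." so that every id can
    ' travel as a parameter instead of being concatenated into the SQL text.
    Function BuildPlaceholders(count As Integer) As String
        Dim names = Enumerable.Range(0, count).Select(Function(n) "@id" & n.ToString())
        Return String.Join(", ", names)
    End Function

    Sub Main()
        ' Stand-in for the ids collected from the ListView tags
        Dim ids As New List(Of Long) From {3L, 7L, 42L}
        Dim placeholders As String = BuildPlaceholders(ids.Count)
        Dim sql As String = $"DELETE FROM SourceTable WHERE id IN ({placeholders});"
        Console.WriteLine(sql)
        ' With a real command, each value would then be bound by name, e.g.
        '   SQLcommand.Parameters.AddWithValue("@id" & i.ToString(), ids(i))
        ' (assuming the provider's parameter collection offers AddWithValue).
    End Sub
End Module
```

SQLite limits how many parameters a single statement may carry, so a very large selection may need to be split into several batches inside the same transaction.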
Thanks for the comments, and best wishes to everyone in their coding.
I am using VB.NET and would like the header text of the columns in a DataGridView to be named after results from the database. For example, the query in my code returns four dates: 30/08/2017, 04/09/2017, 21/09/2017 and 03/02/2018. My aim is to have the column headers in the data grid named after those dates. Your help will be highly appreciated.
sql = "SELECT COUNT (ServiceDate) As NoOfServiceDates FROM (SELECT DISTINCT ServiceDate FROM tblattendance)"
Using command = New OleDbCommand(sql, connection)
Using reader = command.ExecuteReader
reader.Read()
ColumnNo = CInt(reader("NoOfServiceDates")).ToString
End Using
End Using
DataGridView1.ColumnCount = ColumnNo
For i = 0 To DataGridView1.Columns.Count - 1
sql = "SELECT DISTINCT ServiceDate FROM tblattendance"
Using command = New OleDbCommand(sql, connection)
Using reader = command.ExecuteReader
While reader.Read
DataGridView1.Columns(i).HeaderText = reader("ServiceDate").ToString
End While
End Using
End Using
Next
The current code re-runs the query each time through the column loop, so every column's header is set to each date value in turn and the last value in the query ends up in all the columns. You only need to run the query once:
Dim i As Integer = 0
sql = "SELECT DISTINCT ServiceDate FROM tblattendance"
Using command As New OleDbCommand(sql, connection), _
reader As OleDbDatareader = command.ExecuteReader()
While reader.Read
DataGridView1.Columns(i).HeaderText = reader("ServiceDate").ToString
i+= 1
End While
End Using
Additionally, this still results in two separate trips to the database, where you go once to get the count and again to get the values. Not only is this very bad for performance, it leaves you open to a bug where another user changes your data from one query to the next.
There are several ways you can get this down to one trip to the database: loading the results into memory via a List or DataTable, changing the SQL to include the count and the values together, or adding a new column each time through the list. Here's an example using the last option:
DataGridView1.Columns.Clear()
Dim sql As String = "SELECT DISTINCT ServiceDate FROM tblattendance"
Using connection As New OleDbConnection("string here"), _
command As New OleDbCommand(sql, connection)
connection.Open()
Using reader As OleDbDataReader = command.ExecuteReader()
While reader.Read
Dim column As String = reader("ServiceDate").ToString()
DataGridView1.Columns.Add(column, column)
End While
End Using
End Using
Even better if you can use something like Sql Server's PIVOT keyword in combination with the DataGridView's AutoGenerateColumns feature for DataBinding, where you will write ONE SQL statement that has both column info and data, and simply bind the result set to the grid.
The For Next is incorrect. You execute your command for every column, when you only need to execute it once. The last result from the DataReader will be the header for every column as currently written.
You should iterate through your DataReader and increment the cursor variable there:
Dim i As Integer = 0
Using command = New OleDbCommand(sql, connection)
Using reader = command.ExecuteReader
While reader.Read
DataGridView1.Columns(i).HeaderText = reader("ServiceDate").ToString
i += 1
End While
End Using
End Using
I am a little new to using vb.net and SQL so I figured I would check with you guys to see if what I am doing makes sense, or if there is a better way. For the first step I need to read in all the rows from a couple of tables and store the data in the way the code needs to see it. First I get a count:
mysqlCommand = New SQLCommand("SELECT COUNT(*) From TableName")
Try
SQLConnection.Open()
count = myCommand.ExecuteScalar()
Catch ex As SqlException
Finally
SQLConnection.Close()
End Try
Next
Now I just want to iterate through the rows, but I am having a hard time with two parts. First, I cannot figure out the SELECT statement that will let me grab a particular row of the table. I saw the example in "How to select the nth row in a SQL database table?". However, that showed how to do it in SQL only, and I was not sure how well it would translate to a vb.net call.
Second, in the code above, mycommand.ExecuteScalar() tells VB that we expect a number back. I believe the SELECT statement will return a DataRow, but I do not know which Execute() method tells the code to expect that.
Thank you in advance.
A simple approach is to use a DataTable, which you iterate row by row. You can fill it with a DataAdapter. Use the Using statement to properly dispose/close objects that implement IDisposable, such as the connection:
Dim table = New DataTable
Using sqlConnection = New SqlConnection("ConnectionString")
Using da = New SqlDataAdapter("SELECT Column1, Column2, ColumnX FROM TableName ORDER By Column1", sqlConnection)
' you dont need to open/close the connection with a DataAdapter '
da.Fill(table)
End Using
End Using
Now you can iterate all rows with a loop:
For Each row As DataRow In table.Rows
Dim col1 As Int32 = row.Field(Of Int32)(0) ' by ordinal: Column1
Dim col2 As String = row.Field(Of String)("Column2") ' by name
' ...'
Next
or use the table as DataSource for a databound control.
I need to read a column in a table row by row, store each value, and then call a stored procedure to insert the data into a different column, using vb.net.
I have already created the DB connection and I know how to call the procedure,
but I'm not sure how to read the values in a loop and assign each one to a variable so that I can pass it to the stored procedure.
Dim drDocs As SqlClient.SqlDataReader
Dim cmdDocs As SqlClient.SqlCommand
Dim Doc As Long
Dim l As Long
Using conn As New SqlConnection(DBpath)
cmdDocs = New SqlClient.SqlCommand("Select (RecordID) from DocID", conn)
drDocs = cmdDocs.ExecuteReader
Do While drDocs.Read
'need it read each row in that field and hold value'
Loop
drDocs.Close()
cmdDocs.Dispose()
If Doc Then
cmdDocs = New SqlClient.SqlCommand("Insert_Doc", conn)
cmdDocs.CommandType = CommandType.StoredProcedure
'cmdDocs.Parameters.Add("path", SqlDbType.NVarChar).Value = <the held value from reading that column row by row goes here>
End If
End Using
The code you've provided actually works now. It is as Juergen D says: SQL functions like MAX() and MIN(), or using LIMIT, return only one row (or a fixed number of rows) based on their conditions.
If I may, just use this SQL command:
"SELECT RecordID FROM DocID ORDER BY RecordID ASC;"
If you want descending order, use DESC instead.
...Now, reading further, I realize that what you want to do is store the results and then loop through them again so that you can run an SQL command with each one, correct? What you can do then is pass the SQL results to a container (I use DataGridViews) and then loop through the container.
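The "store the results, then loop through the container" idea can be sketched with a plain list instead of a grid. This is only an illustration under stated assumptions: the ReadIds helper is hypothetical, and an in-memory DataTable stands in for the DocID table so the sketch runs without a database. With a real SqlDataReader the same helper would apply, since it only relies on IDataReader. Reading everything first also matters in practice, because an open SqlDataReader normally keeps its connection busy until it is closed.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module ReadThenProcessSketch
    ' Hypothetical helper: drains the first column of any data reader into a list.
    Function ReadIds(reader As IDataReader) As List(Of Long)
        Dim ids As New List(Of Long)
        While reader.Read()
            ids.Add(Convert.ToInt64(reader.GetValue(0)))
        End While
        Return ids
    End Function

    Sub Main()
        ' In-memory stand-in for "SELECT RecordID FROM DocID"
        Dim docIds As New DataTable("DocID")
        docIds.Columns.Add("RecordID", GetType(Long))
        docIds.Rows.Add(101L)
        docIds.Rows.Add(102L)
        docIds.Rows.Add(103L)

        ' Step 1: read every value and close the reader
        Dim ids As List(Of Long)
        Using reader As IDataReader = docIds.CreateDataReader()
            ids = ReadIds(reader)
        End Using

        ' Step 2: loop over the stored values; with a real database this is
        ' where the Insert_Doc stored procedure would be called for each id
        For Each id As Long In ids
            Console.WriteLine($"would call Insert_Doc for RecordID {id}")
        Next
    End Sub
End Module
```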
I'm having some trouble updating changes I made to a datatable via a dataadapter. I am getting "Concurrency violation: the UpdateCommand affected 0 of 10 rows"
'Get data
Dim Docs_DistributedTable As New DataTable("Docs_Distributed")
Dim sql = "SELECT DISTINCT CompanyID, SortKey, OutputFileID, SequenceNo, DeliveredDate, IsDeliveryCodeCounted, USPS_Scanned FROM Docs_Distributed_Test"
Using sqlCmd As New SqlCommand(sql, conn)
sqlCmd.CommandType = CommandType.Text
Docs_DistributedTable.Load(sqlCmd.ExecuteReader)
End Using
'Make various updates to some records in DataTable.
'Update the Database
Dim sql As String = "UPDATE Docs_Distributed "
sql += "SET DeliveredDate = @DeliveredDate "
sql += "WHERE SequenceNo = @SequenceNo"
Using transaction As SqlTransaction = conn.BeginTransaction("ProcessConfirm")
Try
Using da As New SqlDataAdapter
da.UpdateCommand = conn.CreateCommand()
da.UpdateCommand.Transaction = transaction
da.UpdateCommand.CommandText = sql
da.UpdateCommand.Parameters.Add("@DeliveredDate", SqlDbType.DateTime).SourceColumn = "DeliveredDate"
da.UpdateCommand.Parameters.Add("@SequenceNo", SqlDbType.Int).SourceColumn = "SequenceNo"
da.ContinueUpdateOnError = False
da.Update(Docs_DistributedTable)
End Using
transaction.Commit()
Catch ex As Exception
transaction.Rollback()
End Try
End Using
Now here's the catch. I am selecting DISTINCT records and essentially getting one row per SequenceNo. There may be many rows with the same SequenceNo, and I am hoping this will update them all. I'm not sure if this is related to my problem or not.
Your SELECT is from "Docs_Distributed_Test" and your UPDATE is to "Docs_Distributed"; this may be the cause of your issue. Are the sequence IDs the same in both tables? (If not, the update may indeed be affecting 0 rows.)
Other than that, you can always disable optimistic concurrency on your table-adapter and it will no longer enforce the validation (Though in this case that would likely result in no error but not updating any rows).
I don't understand the Microsoft-specific aspects of this, plus VB is often hard to follow. But this sequence seems suspect:
Using transaction As SqlTransaction = conn.BeginTransaction("ProcessConfirm")
Try
Using da As New SqlDataAdapter
da.UpdateCommand = conn.CreateCommand()
da.UpdateCommand.Transaction = transaction
conn.BeginTransaction is followed by conn.CreateCommand(). Isn't that a) useless, b) hazardous to the connection state, or c) potentially a race condition?