Insert Records repeatedly faster

Insert Records repeatedly faster - vb.net

I'm monitoring a folder for Jpg files and need to process the incoming files.
I decode the filename to get all the information I want and insert into a table and then move the file to another folder.
The file name is already contains all the information I want. Eg.
2011--8-27_13:20:45_MyLocation_User1.jpg.
Now I'm using an Insert statement
Private Function InsertToDB(ByVal SourceFile As String, ByVal Date_Time As DateTime, ByVal Loc As String, ByVal User As String) As Boolean
Dim conn As SqlConnection = New SqlConnection(My.Settings.ConString)
Dim sSQL As String = "INSERT INTO StageTbl ...."
Dim cmd As SqlComman
cmd = New SqlCommand(sSQL, conn)
....Parameters Set ...
conn.Open()
cmd.ExecuteNonQuery()
conn.Close()
conn = Nothing
cmd = Nothing
End Function
The function will be called for every single file found.
Is this most efficient way? Looks like its is very slow. I need to process about 20~50 files/sec. Probably a stored procedure?
I need to do this as fast as possible. I guess bulk insert not applicable here.
Please help.

Bulk insert could be applicable here - do you need them to be in the DB instantly, or could you just build up the records in memory then push them into the database in batches?
Are you multi-threading as well - otherwise your end to end process could get behind.
Another solution would be to use message queues - pop a message into the queue for every file, then have a process (on a different thread) that is continually reading the queue and adding to the database.

There are several things you can do to optimize the speed of this process:
Don't open and close the connection for every insert. That alone will yield a (very) significant performance improvement (unless you were using connection pooling already).
You may gain performance if you disable autocommit and perform inserts in blocks, commiting the transaction after every N rows (100-1000 rows is a good number to try for a start).
Some DB systems provide a syntax to allow insertion of multiple rows in a single query. SQL Server doesn't but you may be interested on this: http://blog.sqlauthority.com/2007/06/08/sql-server-insert-multiple-records-using-one-insert-statement-use-of-union-all/
If there are many users/processes accessing this table, access can be slow depending on your transaction isolation level. In your case (20-50 inserts/sec) this shouldn't make a big difference. I don't recommend modifying this unless you understand well what you are doing: http://en.wikipedia.org/wiki/Isolation_%28database_systems%29 and http://technet.microsoft.com/es-es/library/ms173763.aspx .
I don't think a stored procedure will necessarily provide a big performance gain. You are only parsing/planning the insert 20-50 times per second. Use a stored procedure only if it fits well your development model. If all your other queries are in code, you can avoid it.
Ensure your bottleneck is the database (i.e. moving files is not taking a lot of time), but since the OS should be good at this, check the points above. If you find that moving files is your bottleneck, delaying or moving files in the background (another thread) can help to a certain extent.

Related

Copying Tables contents of one Database to another from a vb.NET app using OracleDataAdapter.InsertCommand

So the rundown of what I'm trying to achieve is essentially an update app that will pull data from our most recent production databases and copy it's contents to the Devl or QA databases. I plan to limit what is chosen by a number of rows so to increase the consistency that this update can happen by allowing us to only get what we need, as for right now these databases rarely get updated due to the vast size of the copy job. The actual pl/sql commands will be stored in a table that I plan to reference for each table, but I'm currently stuck on the best and easiest way to transfer these between these two databases while still getting my commandText to be used. I figured the best way would be to use the OracleDataAdapter.InsertCommand command, but very few examples can be found as to what I'm doing, any suggestions aside from the .InsertCommand are welcome as I'm still getting my footing with Oracle all together.
Dim da As OracleDataAdapter = New OracleDataAdapter
Dim cmd As New OracleCommand()
GenericOraLoginProvider.Connect()
' Create the SelectCommand.
cmd = New OracleCommand("SELECT * FROM TAT_TESTTABLE ", GenericOraLoginProvider.Connection())
da.SelectCommand = cmd
' Create the InsertCommand.
cmd = New OracleCommand("INSERT INTO TAT_TEMP_TESTTABLE", GenericOraLoginProvider.Connection())
da.InsertCommand = cmd
Question: This is an example of what I've been trying as a first step with the Insert command, TAT_TESTTABLE and TAT_TEMP_TESTTABLE are just junk tables that I loaded with data to see if I could move things the way I wanted this way.
As why I'm asking this question the data isn't transferring over, while these tables are on the same database in the future they will not be along with the change to the previously mentioned pl/sql commands. Thankyou for any help, or words of wisdom you can provide, and sorry for the wall of text I tried to keep it specific.

lookup sqlbulkcopy. I use this to transfer data between all kinds of vendor databases.. https://msdn.microsoft.com/en-us/library/ex21zs8x(v=vs.110).aspx

Moving from Access backend to SQL Server as be. Efficiency help needed

I am working on developing an application for my company. From the beginning we were planning on having a split DB with an access front end, and storing the back end data on our shared server. However, after doing some research we realized that storing the data in a back end access DB on a shared drive isn’t the best idea for many reasons (vpn is so slow to shared drive from remote offices, access might not be the best with millions of records, etc.). Anyways, we decided to still use the access front end, but host the data on our SQL server.
I have a couple questions about storing data on our SQL server. Right now when I insert a record I do it with something like this:
Private Sub addButton_Click()
Dim rsToRun As DAO.Recordset
Set rsToRun = CurrentDb.OpenRecordset("SELECT * FROM ToRun")
rsToRun.AddNew
rsToRun("MemNum").Value = memNumTextEntry.Value
rsToRun.Update
memNumTextEntry.Value = Null
End Sub
It seems like it is inefficient to have to use a sql statement like SELECT * FROM ToRun and then make a recordset, add to the recordset, and update it. If there are millions of records in ToRun will this take forever to run? Would it be more efficient just to use an insert statement? If so, how do you do it? Our program is still young in development so we can easily make pretty substantial changes. Nobody on my team is an access or SQL expert so any help is really appreciated.

If you're working with SQL Server, use ADO. It handles server access much better than DAO.
If you are inserting data into a SQL Server table, an INSERT statement can have (in SQL 2008) up to 1000 comma-separated VALUES groups. You therefore need only one INSERT for each 1000 records. You can just append additional inserts after the first, and do your entire data transfer through one string:
INSERT INTO ToRun (MemNum) VALUES ('abc'),('def'),...,('xyz');
INSERT INTO ToRun (MemNum) VALUES ('abcd'),('efgh'),...,('wxyz');
...
You can assemble this in a string, then use an ADO Connection.Execute to do the work. It is frequently faster than multiple DAO or ADO .AddNew/.Update pairs. You just need to remember to requery your recordset afterwards if you need it to be populated with your newly-inserted data.

There are actually two questions in your post:
Will OpenRecordset("SELECT * FROM ToRun") immediately load all recordsets?
No. By default, DAO's OpenRecordset opens a server-side cursor, so the data is not retrieved until you actually start to move around the recordset. Still, it's bad practice to select lots of rows if you don't need to. This leads to the next question:
How should I add records in an attached SQL Server database?
There are a few ways to do that (in order of preference):
Use an INSERT statment. That's the most elegant and direct solution: You want to insert something, so you execute INSERT, not SELECT and AddNew. As Monty Wild explained in his answer, ADO is prefered. In particular, ADO allows you to use parameterized commands, which means that you don't have to put-into-quotes-and-escape your strings and correctly format your dates, which is not so easy to do right.
(DAO also allows you to execute INSERT statements (via CurrentDb.Execute), but it does not allow you to use parameters.)
That said, ADO also supports the AddNew syntax familiar to you. This is a bit less elegant but requires less changes to your existing code.
And, finally, your old DAO code will still work. As always: If you think you have a performance problem, measure if you really have one. Clean code is great, but refactoring has a cost and it makes sense to optimize those places first where it really matters. Test, measure... then optimize.

It seems like it is inefficient to have to use a sql statement like SELECT * FROM ToRun and then make a recordset, add to the recordset, and update it. If there are millions of records in ToRun will this take forever to run?
Yes, you do need to load something from the table in order to get your Recordset, but you don't have to load any actual data.
Just add a WHERE clause to the query that doesn't return anything, like this:
Set rsToRun = CurrentDb.OpenRecordset("SELECT * FROM ToRun WHERE 1=0")
Both INSERT statements and Recordsets have their pros and cons.
With INSERTs, you can insert many records with relatively little code, as shown in Monty Wild's answer.
On the other hand, INSERTs in the basic form shown there are prone to SQL Injection and you need to take care of "illegal" characters like ' inside your values, ideally by using parameters.
With a Recordset, you obviously need to type more code to insert a record, as shown in your question.
But in exchange, a Recordset does some of the work for you:
For example, in the line rsToRun("MemNum").Value = memNumTextEntry.Value you don't have to care about:
characters like ' in the input, which would break an INSERT query unless you use parameters
SQL Injection
getting the date format right when inserting date/time values

VB6 Access speed when UPDATE to table

I know VB6 is a bit out of date but that's the application at the moment that I have inherited.
I have to do an update based on the results of an array to an access table.
The array contains a double and the id of the record to update.
The issue is that there are 120,000 records to update but on a test it took 60 seconds just to run it on 374 records.
This is my update code
Dim oCon As New ADODB.Connection
Dim oRs As New ADODB.Recordset
Dim string3 As String
oCon.Open "Driver={Microsoft Access Driver (*.mdb)};Dbq=" & App.Path &"\hhis.mdb;Pwd=/1245;"
oCon.BeginTrans
For i = 0 To maxnumberofrecords - 1
string3 = "UPDATE YourRatings SET yourRatings=" & YourTotalRatingsAll(i) & " Where thisID = " & thisID(i) & ";"
oRs.Open string3, oCon, adOpenStatic, adLockOptimistic
Next i
oCon.CommitTrans
oCon.Close
I added the "CommitTrans" based on some other articles that I had read but that doesn't seem to increase the speed.
The other problem is that I am going to have to run another query to rank the highest(1) to lowest (374) and update the db again...although I can probably do something with the array to add that column at the same time.
This seems quite slow to me especially when other post mention 200000 records in 14 seconds.
Have I just missed something?
Would it be quicker to load from a text file?
Thank you in advance for any help.
Mal

With Open you always constructing a new ResultSet object. Try oCon.execute string3 which only sends the SQL to your database without ResultSet overhead.
Ensure that you have an index on thisID.
Maybe your Access DB sits on a network drive. That could have a large performance impact. Try it local.

Why are you using the creaky old Access Desktop ODBC Driver instead of the Jet 4.0 OLEDB Provider?
Why are you opening a Recordset to do an UPDATE instead of calling Execute on the Connection?
Is there any reason you can't open the database for exclusive access to perform this operation? Locking overhead is all but eliminated and "cargo cult" techniques like thrashing with BeginTrans/CommitTrans lose any significance.
Do you have an index on your thisID field in the table?
Please do move to the magic bullet .Net as soon as possible. Your programs will be even slower but we won't have to read all the whinging blaming VB6.

Just to add to Wumpz answer, you might want to try just adding the query directly into the Access database and calling this using parameters. This SO thread says far more about it.
Parameters are far more stable and less hackable than injection.

insert blob via odp.net through dblink size limitations

i am using ODP.NET (version 2.111.7.0) and C#, OracleCommand & OracleParameter objects and OracleCommand.ExecuteNonQuery method
i was wondering if there is a way to insert a big byte array into an oracle table that resides in another database, through DB link. i know that lob handling through DB links is problematic in general, but i am a bit hesitant to modify code and add another connection.
will creating a stored procedure that takes blob as parameter and talks internally via dblink make any difference? don't think so..
my current situation is that Oracle will give me "ORA-22992: cannot use LOB locators selected from remote tables" whenever the parameter i pass with the OracleCommand is a byte array with length 0, or with length > 32KB (i suspect, because 20KB worked, 35KB didn't)
I am using OracleDbType.Blob for this parameter.
thank you.
any ideas?

i ended up using a second connection, synchronizing the two transactions so that commits and rollbacks are always performed jointly. i also ended up actually believing that there's a general issue with handling BLOBs through a dblink, so a second connection was a better choice, although in my case the design of my application was slightly disrupted - needed to introduce both a second connection and a second transaction. other option was to try and insert the blob in chunks of 32k, using PL/SQL blocks and some form of DBMS_LOB.WRITEAPPEND, but this would require more deep changes in my code (in my case), therefore i opted for the easier and more straightforward first solution.

All of a Sudden , Sql Server Timeout

We got a legacy vb.net applicaction that was working for years
But all of a sudden it stops working yesterday and gives sql server timeout
Most part of application gives time out error , one part for example is below code :
command2 = New SqlCommand("select * from Acc order by AccDate,AccNo,AccSeq", SBSConnection2)
reader2 = command2.ExecuteReader()
If reader2.HasRows() Then
While reader2.Read()
If IndiAccNo <> reader2("AccNo") Then
CAccNo = CAccNo + 1
CAccSeq = 10001
IndiAccNo = reader2("AccNo")
Else
CAccSeq = CAccSeq + 1
End If
command3 = New SqlCommand("update Acc Set AccNo=#NewAccNo,AccSeq=#NewAccSeq where AccNo=#AccNo and AccSeq=#AccSeq", SBSConnection3)
command3.Parameters.Add("#AccNo", SqlDbType.Int).Value = reader2("AccNo")
command3.Parameters.Add("#AccSeq", SqlDbType.Int).Value = reader2("AccSeq")
command3.Parameters.Add("#NewAccNo", SqlDbType.Int).Value = CAccNo
command3.Parameters.Add("#NewAccSeq", SqlDbType.Int).Value = CAccSeq
command3.ExecuteNonQuery()
End While
End If
It was working and now gives time out in command3.ExecuteNonQuery()
Any ideas ?
~~~~~~~~~~~
Some information :
There isnt anything that has been changed on network and the app uses local database
The main issue is that even in development environment it donest work anymore

I'll state the obvious - something changed. It could be an upgrade that isn't having the desired effect - it could be a network component going south - it could be a flakey disk - it could be many things - but something in the access path has changed. What other problem indications are you seeing, including problems not directly related to this application? Where is the database stored (local disk, network storage box, written by angels on the head of a pin, other)? Has your system administrator "helped" or "improved" things somehow? The code has not worn out - something else has happened.

Is it possible that this query has been getting slower over time and is now just exceeded the default timeout?
How many records would be in the acc table and are there indexes on AccNo and AccSeq?
Also what version of SQL are you using?

How long since you updated statistics and rebuilt indexes?
How much has your data grown? Queries that work fine for small datasets can be bad for large ones.
Are you getting locking issues? [AMJ] Have you checked activity monitor to see if there are locks when the timeout occurs?
Have you run profiler to grab the query that is timing out and then run it directly onthe server? Is it faster then? Could also be network issues in moving the information from the database server to the application. That would at least tell you if it s SQl Server issue or a network issue.
And like Bob Jarvis said, what has recently changed on the server? Has something changed in the database structure itself? Has someone added a trigger?

I would suggest that there is a lock on one of the records that you are trying to update, or there are transactions that haven't been completed.
I know this is not part of your question, but after seeing your sample code i have to make this comment: is there any chance you could change your method of executing sql on your database? It is bad on so many levels.

Perhaps should you set the CommandTimeout property to a higher delay?
Doing so will allow your command to wait a little longer for the underlying database to respond. As I see it, perhaps are you not letting time enough for your database engine to perform all what is required before creating another command to perform your update.
Know that the SqlDataReader continues to "SELECT" while feeding the in-memory objects. Then, while reading, you require your code to update some other table, which your DBE just can't handle, by the time your SqlCommand requires, than times out.

any chances of a "quotes" as part of the strings you are passing to queries?
any chances of date dependent queries where a special condition is not working anymore?

Have you tested the obvious?
Have you run the "update Acc Set AccNo=#NewAccNo,AccSeq=#NewAccSeq where AccNo=#AccNo and AccSeq=#AccSeq" query directly on your SQL Server Management Studio? (Please replace the variables with some hard coded values)
Have you run the same test on another colleague's PC?

Can we make sure that the SQLConnection is working fine. It could be the case that SQL login criteria is changed and connection is getting a timeout. It will be probably more helpful if you post the error message here.

You can rewrite the update as a single query. This will run much faster than the original query.
UPDATE subquery
SET AccNo = NewAccNo, AccSeq = NewAccSeq
FROM
(SELECT AccNo, AccSeq,
DENSE_RANK() OVER (PARTITION BY AccNo ORDER BY AccNo) NewAccNo,
ROW_NUMBER() OVER (PARTITION BY AccNo ORDER BY AccDate, AccSeq)
+ 10000 NewAccSeq
FROM Acc) subquery

After HLGEM's suggestions, I would check the data and make sure it is okay. In cases like this, 95% of the time it is the data.

Make sure disk is defragged. Yes, I know, but it does make a difference. Not the built-in defragger. One that defrags and optimizes like PerfectDisk.

This may be a bit of a long shot, but if your entire application has stopped working, have you run out of space for the transaction log in your database? Either it's been specified to an absolute size, and that has been reached, or your disk is just full.

May be your tables include more information, and defined SqlConnection.ConnectionTimeout property value in config file with little value. And this value isn't necessary to execute your queries.
you can trying optimize your queries, and also rebuilt indexes.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas