What does ExecuteNonQuery() do during a bulk insert? - vb.net

I have written some working bulk-insert VB.NET code. It calls ExecuteNonQuery() for each insert and then calls Commit() at the end.
My question is: where are these inserts held while waiting for the Commit() call? I have not added any batching support yet, so with my existing code a million rows will be inserted before Commit() is called. I ask because I want to know whether I will run into memory issues, which would force me to change my code now.

In the normal rollback journal mode, changes are simply written to the database. However, to allow atomic commits, the previous contents of all changed database pages are written to the rollback journal so that a rollback can restore the previous state.
(When you do so many inserts that new pages need to be allocated, there is no old state for those pages.)
In WAL mode, all changes are written to the write-ahead log.
In either case, nothing is actually written to disk until the amount of changed data overflows the page cache (which has a size of about 2 MB by default).
So the size of a transaction is not limited by memory, only by disk space.

In a bulk insert, command.ExecuteNonQuery() returns the number of rows affected by the INSERT, UPDATE or DELETE statement.
In your case, after each successful insert it will return 1 as an integer.
If you are not using a transaction, Commit doesn't make any sense: when no transaction is explicitly open, changes are committed automatically after each statement.
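For illustration, here is a minimal sketch of the pattern described in the question, assuming the System.Data.SQLite provider; the connection string, table and column names are placeholders, not taken from the original code.

Imports System.Data
Imports System.Collections.Generic
Imports System.Data.SQLite

Module BulkInsertSketch
    ' Inserts all values inside one transaction; ExecuteNonQuery returns the
    ' number of rows affected (1 per successful single-row insert).
    Sub InsertAll(values As IEnumerable(Of Integer))
        Using conn As New SQLiteConnection("Data Source=mydb.sqlite")
            conn.Open()
            Using tran = conn.BeginTransaction()
                Using cmd = conn.CreateCommand()
                    cmd.Transaction = tran
                    cmd.CommandText = "INSERT INTO mytable (col1) VALUES (@p1)"
                    Dim p1 = cmd.Parameters.Add("@p1", DbType.Int32)
                    For Each v In values
                        p1.Value = v
                        Dim affected As Integer = cmd.ExecuteNonQuery() ' 1 on success
                    Next
                End Using
                ' Pending changes accumulate in SQLite's page cache and journal/WAL on disk,
                ' not in .NET memory, so one large commit is not a memory problem.
                tran.Commit()
            End Using
        End Using
    End Sub
End Module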

Delete and Insert Inside one Transaction SQL

I just want to ask: within a transaction, will the first statement always be executed first? For example, I have 500k records to be deleted and 500k to be inserted; is there a possibility of locking?
I have already tested this query and it works fine, but I want to make sure my assumption is correct.
Note: this will delete and insert the same records, with possible updates to other columns.
BEGIN TRAN;
DELETE FROM OutputTable WHERE ID IN (1, 2, 3, 4 /* etc */);
INSERT INTO OutputTable VALUES (1, 2, 3, 4 /* etc */);
COMMIT TRAN;
Within a transaction all write locks (all locks acquired for modifications) must obey the strict two phase locking rule. One of the consequences is that a write (X) lock acquired in a transaction cannot be released until the transaction commits. So yes, the DELETE and INSERT will execute sequentially and all locks acquired during the DELETE will be retained while executing the INSERT.
Keep in mind that deleting 500k rows in a transaction will escalate the locks to one table lock, see Lock Escalation.
Deleting 500k rows and inserting 500k rows in a single transaction, while maybe correct, is a bad idea. You should avoid such large units of works, long transaction, if possible. Long transactions pin the log in place, create blocking and contention, increase recovery and DB startup time, increase SQL Server resource consumption (locks require memory).
You should consider doing the operation in small batches (perhaps 10,000 rows at a time), use MERGE instead of DELETE/INSERT (if possible) and, last but not least, consider a partitioned sliding window
implementation, see How to Implement an Automatic Sliding Window in a Partitioned Table.
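As an illustration of the MERGE alternative mentioned above, here is a hedged sketch; OutputTable, StagingTable, ID and Col1 are hypothetical names standing in for the real schema.

-- Hypothetical sketch: a single MERGE replaces the DELETE/INSERT of the same keys.
-- OutputTable, StagingTable, ID and Col1 are placeholder names.
MERGE OutputTable AS tgt
USING StagingTable AS src
    ON tgt.ID = src.ID
WHEN MATCHED THEN
    UPDATE SET tgt.Col1 = src.Col1                  -- update changed columns in place
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ID, Col1) VALUES (src.ID, src.Col1);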
From the documentation on TRANSACTION (emphasis mine):
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
BEGIN TRANSACTION starts a local transaction for the connection
issuing the statement. Depending on the current transaction isolation
level settings, many resources acquired to support the Transact-SQL
statements issued by the connection are locked by the transaction
until it is completed with either a COMMIT TRANSACTION or ROLLBACK
TRANSACTION statement. Transactions left outstanding for long periods
of time can prevent other users from accessing these locked resources,
and also can prevent log truncation.
Although BEGIN TRANSACTION starts a local transaction, it is not
recorded in the transaction log until the application subsequently
performs an action that must be recorded in the log, such as executing
an INSERT, UPDATE, or DELETE statement. An application can perform
actions such as acquiring locks to protect the transaction isolation
level of SELECT statements, but nothing is recorded in the log until
the application performs a modification action.
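To make the quoted behaviour concrete, here is a hedged sketch of an explicit transaction with rollback on error; the table name OutputTable and the literal values are placeholders, not from the original post.

BEGIN TRY
    BEGIN TRAN;
    DELETE FROM OutputTable WHERE ID IN (1, 2, 3, 4);
    INSERT INTO OutputTable (ID) VALUES (1), (2), (3), (4);
    COMMIT TRAN;            -- make the modifications permanent
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRAN;      -- return the data to its state before BEGIN TRAN
    THROW;
END CATCH;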

batch procedure, when to commit transactions?

I'm pretty new to PL/SQL although I've got lots of DB experience with other RDBMSs. Here's my current issue.
procedure CreateWorkUnit
is
begin
  update workunit
     set workunitstatus = 2   -- workunit loaded
   where SYSDATE between START_DATE and END_DATE
     and workunitstatus = 1;  -- workunit created
  --commit here?
  loader;  -- loads records based on status, will have a commit of its own
  update workunit wu
     set workunititemcount = (select count(*) from workunititems wui where wui.wuid = wu.wuid)
   where workunitstatus = 2;
end CreateWorkUnit;
So the behaviour I'm seeing, with or without commit statements, is that I have to execute it twice: once to flip the statuses, and then the loader runs on the second execution. I'd like it all to run in one go.
I'd appreciate any words of oracle wisdom.
Thanks!
When to commit transactions in a batch procedure? It is a good question, although it only seems vaguely related to the problems with the code you post. But let's answer it anyway.
We need to commit when the PL/SQL procedure has completed a unit of work. A unit of work is a business transaction. This would normally be at the end of the program, the last statement before the EXCEPTION section.
Sometimes not even then. The decision to commit or rollback properly lies with the top of the calling stack. If our PL/SQL is being called from a client (maybe a user clicking a button on a screen) then perhaps the client should issue the commit.
But it is not unreasonable for a batch process to manage its own commit (and rollback in the case of errors). The main point is that only the topmost procedure should issue COMMIT. If a procedure calls other procedures, those called programs should not issue commits or rollbacks. They should handle any errors (log them, etc.) and re-raise them to the calling program; let it decide whether to roll back. Because all the called procedures run in the same session and hence the same transaction, a rollback in a called program would revert all the changes in the batch process. That's not right. The same reasoning applies to commits.
You will sometimes read advice on using intermittent commits to break up long-running processes into smaller units, e.g. committing every 1000 inserts. This is bad advice for several reasons, not all of them related to transactions. The pertinent ones are:
Issuing a commit releases locks and allows the undo generated by the transaction to be overwritten; a long-running query that still needs that undo for read consistency can then fail with ORA-01555 "snapshot too old".
It also breaks read consistency, which applies at the statement and/or transaction level; committing while still fetching from a cursor (a SELECT ... FOR UPDATE cursor, for instance) is the classic cause of ORA-01002 "fetch out of sequence" errors.
It affects re-startability. If the program fails having processed 30% of the records, can we be confident it will only process the remaining 70% when we re-run the batch?
Once we commit records other sessions can see those changes: does it make sense for other users to see a partially changed view of the data?
So, the words of "Oracle wisdom" are: always align the database transaction with the business transaction, with a single commit per unit of work.
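By way of illustration, here is a hedged sketch of that advice: the called procedure does not commit, and the top-level batch procedure issues the single commit (or rollback). load_items, run_batch and staging_items are hypothetical names; workunit and workunititems are borrowed from the question.

-- Hypothetical sketch: commit only at the top of the call stack.
create or replace procedure load_items is
begin
  insert into workunititems (wuid, item)
  select wuid, item from staging_items;
  -- no COMMIT here: the caller owns the transaction
end load_items;
/
create or replace procedure run_batch is
begin
  update workunit set workunitstatus = 2 where workunitstatus = 1;
  load_items;
  commit;        -- one commit per business transaction
exception
  when others then
    rollback;    -- undo the whole unit of work
    raise;       -- re-raise so the caller/scheduler sees the failure
end run_batch;
/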
Somebody mentioned autonomous transactions as a way of issuing commits in sub-processes. This is usually a bad idea. Changes committed in an autonomous transaction become visible to other sessions immediately, regardless of whether our own (main) transaction eventually commits or rolls back. That very rarely makes sense. It also creates the same problems with re-startability which I discussed earlier.
The only acceptable use for autonomous transactions is recording activity (error log, trace, audit records). We need that data to persist regardless of what happens in the wider transaction. Any other use of the pragma is almost certainly a workaround for a poor design, and usually just makes the problem worse.
You may not need to commit in the PL/SQL procedure at all. Procedures you call from another procedure run in the same session, so they don't need their own commits. Bear in mind that a procedure's changes are rolled back if the session rolls back or an unhandled exception occurs.
I mis-classified my problem. I thought this was a transaction problem, but really it was one of my flags not being set as expected: a number field was NULL when I was expecting 0.
Sorry for that.
Josh Robinson

When is a row actually inserted into DB?

When is a row actually inserted into the database? Is it when the INSERT statement finishes, or when the COMMIT statement after the INSERT finishes?
Later than you think. The principles here apply generally.
The whole point of the transaction log is to ensure that ACID holds even if there is a power failure just as the INSERT finishes. The INSERT will be rolled forward or rolled back as part of the recovery phase (in most RDBMSs).
So what matters more is that the transaction log entry is acknowledged as stored on durable media; only then can the INSERT commit.
The data page containing the changed row will end up on disk eventually (checkpoint etc) but not necessarily at the point of successful commit.
However, the data page is in memory and available for use.
Note, an INSERT could cause a page split, indexes to be updated, triggers to fire etc so what I've said is simplified.
And it doesn't matter one way or the other when the data ends up on disk, as long as I can get the data back and it's safe in case of, say, a power failure.
An oldie but still relevant for SQL Server: SQL Server 2000 I/O Basics
And what I've summarized is Write Ahead Logging
If you are running inside a transaction, when the transaction is committed. Otherwise, immediately.
Depends on the database/table implementation. It might not be until the transaction log is applied to the data files; until then the row exists only in the transaction log and in memory.

Update Zero Rows and then Committing?

Which procedure is more performant for an update which affects zero rows?
UPDATE table SET column = value WHERE id = number;
IF SQL%Rowcount > 0 THEN
COMMIT;
END IF;
or
UPDATE table SET column = value WHERE id = number;
COMMIT;
In other words, if an UPDATE affects zero rows and a commit is issued, am I incurring any added expense at all?
I have a system which is being hampered by log file sync waits, and I'm wondering whether issuing a COMMIT against a transaction which affected zero rows will write to the log and thus cause more contention on LGWR.
A COMMIT does force a log file sync, so the session will indeed have to wait.
However, a ROLLBACK does too, and at some point one or the other has to happen.
So if you issue neither COMMIT nor ROLLBACK, you are just left with an open transaction which sooner or later will cause a log file sync wait.
You probably want to batch your UPDATE operations rather than committing as soon as the first successful update happens.
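For illustration, a hedged sketch of batching the updates and paying the log file sync once; the table t, its columns and the sample values are placeholders:

-- Placeholder sketch: apply all updates, then one COMMIT (one log file sync).
declare
  type t_ids  is table of number;
  type t_vals is table of varchar2(100);
  l_ids  t_ids  := t_ids(1, 2, 3);
  l_vals t_vals := t_vals('a', 'b', 'c');
begin
  forall i in 1 .. l_ids.count
    update t set col = l_vals(i) where id = l_ids(i);
  commit;   -- a single log file sync for the whole batch
end;
/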
There are risks in the conditional-commit approach. Technically, while the UPDATE may affect zero rows, it can still fire statement-level BEFORE/AFTER UPDATE triggers on the table (row-level triggers won't fire). Those triggers could potentially "do something" that requires a commit/rollback.
Safer to check to see if LOCAL_TRANSACTION_ID is set.
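A minimal sketch of that check, assuming the standard DBMS_TRANSACTION package; the table and column names are placeholders from the question:

-- Commit only if a transaction is actually open in this session.
declare
  l_txn varchar2(200);
begin
  update t set col = 'value' where id = 42;
  l_txn := dbms_transaction.local_transaction_id;  -- NULL when no transaction is open
  if l_txn is not null then
    commit;
  end if;
end;
/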
There are any number of reasons which can underlie waits for log file sync. It seems unlikely that the main culprit is committing SQL statements which have updated zero rows. It is true that issuing too many commits can cause this problem, for instance if the application is set up to commit after every statement (e.g. by using AUTOCOMMIT=TRUE) instead of designing proper transactions. If that is the cause, there is not much you can do short of a major rewrite of the application.
If you want to delve deeper into the root causes of your problem I recommend you read this exhaustive (and exhausting) article by Pythian's Riyaj Shamsudeen on Tuning ‘log file sync’ Event Waits.

Update a newly created row before final commit

insert into XYZ (col1, col2) values (1, 2);
update XYZ set ... where col1 = 1;
COMMIT;
As you can see in the above code, we haven't yet committed our insert statement; we then perform an update operation on the same row, and finally we commit the whole batch.
What exactly would happen in this case? Are there any chances of losing data in this scenario?
Your session is always able to see its own modifications, even before you issue a commit.
The newly inserted row will be updated.
The only way you could "lose data" would be an interruption before the commit, in which case none of the operations would be applied at all.
The important words in Vincent's response are "your session".
A separate session will only see the unmodified data until you commit. That's part of what read consistency means.
Depending on the frameworks and tools you're using, your session may get a lock on the record when you perform the update, preventing other sessions from updating it until you commit or rollback.
For further reading, here is a link to the "Data Concurrency and Consistency" section of the excellent Oracle Concepts Guide 10gR2
http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/consist.htm
In fact, undo for all transactions is stored in rollback segments within a tablespace of that particular instance. A rollback segment is a storage area within a tablespace that holds the transaction information used to guarantee data integrity during a ROLLBACK and to provide read consistency across multiple transactions.