Is there a difference between a SELECT statement inside a transaction and one that is outside of it? - sql

Does the default READ COMMITTED isolation level somehow makes the SELECT statement act different inside of a transaction than one that is not in a transaction?
I am using MS SQL.

Yes, the one inside the transaction can see changes made by other previous Insert/Update/delete statements in that transaction; a Select statement outside the transaction cannot.
If all you are asking about is what the Isolation Level does, then understand that all Select statements (hey, all statements of any kind) - are in a transaction. The only difference between one that is explicitly in a transaction and one that is standing on its own is that the one that is standing alone starts its transaction immediately before it executes it, and commits or roll back immediately after it executes;
whereas the one that is explicitly in a transaction can (because it has a Begin Transaction statement) can have other statements (inserts/updates/deletes, whatever) occurring within that same transaction, either before or after that Select statement.
So whatever the isolation level is set to, both selects (inside or outside an explicit transaction) will nevertheless be in a transaction which is operating at that isolation level.
Addition:
The following is for SQL Server, but all databases MUST work in the same way. In SQL Server the Query Processor is always in one of 3 Transaction Modes, AutoCommit, Implicit, or Explicit.
AutoCommit is the default transaction management mode of the SQL Server Database Engine. .. Every Transact-SQL statement is committed or rolled back when it completes. ... If a statement completes successfully, it is committed; if it encounters any error, it is rolled back. This is the default, and is the answer to #Alex's question in the comments.
In Implicit Transaction mode, "... the SQL Server Database Engine automatically starts a new transaction after the current transaction is committed or rolled back. You do nothing to delineate the start of a transaction; you only commit or roll back each transaction. Implicit transaction mode generates a continuous chain of transactions. ..." Note that the italicized snippet is for each transaction, whether it be a single or multiple statement transaction.
The engine is placed in Explicit Transaction mode when you explicitly initiate a transaction with BEGIN TRANSACTION Statement. Then, every statement is executed within that transaction until you explicitly terminate the transaction (with COMMIT or ROLLBACK) or if a failure occurs that causes the engine to terminate and Rollback.

Yes, there is a bit of a difference. For MySQL, the database doesn't actually start with a snapshot until your first query. Therefore, it's not begin that matters, but the first statement within the transaction. If I do the following:
#Session 1
begin; select * from table;
#Session 2
delete * from table; #implicit autocommit
#Session 1
select * from table;
Then I'll get the same thing in session one both times (the information that was in the table before I deleted it). When I end session one's transaction (commit, begin, or rollback) and check again from that session, the table will show as empty.

The READ COMMITTED isolation level is about the records that have been written. It has nothing to do with whether or not this select statement is in a transaction (except for those things written during that same transaction).

If your database (or in mysql, the underlying storage engine of all tables used in your select statement) is transactional, then there simply no way to execute it "outside of a transaction".
Perhaps you meant "run it in autocommit mode", but that is not the same as "not transactional". In the latter case, it still runs in a transaction, it's just that the transaction ends immediately after your statement is finshed.
So, in both cases, during the run, a single select statement will be isolated at the READ COMMITTED level from the other transactions.
Now what this means for your READ COMMITTED transaction isolation level: perhaps surprisingly, not that much.
READ COMMITTED means that you may encounter non-repeatable reads: when running multiple select statements in the same transaction, it is possible that rows that you selected at a certain point in time are modified and comitted by another transaction. You will be able to see those changes when you re-execute the select statement later on in the same pending transaction. In autocommit mode, those 2 select statements would be executed in their own transaction. If another transaction would have modified and committed the rows you selected the first time, you would be able to see those changes just as well when you executed the statement the second time.

Related

Delete and Insert Inside one Transaction SQL

I just want to ask if it is always the first query will be executed when encapsulate to a transaction? for example i got 500 k records to be deleted and 500 k to be inserted, is there a possibility of locking?
Actually I already test this query and it works fine but i want to make sure if my assumption is correct.
Note: this will Delete and Insert the same record with possible update on other columns.
BEGIN TRAN;
DELETE FROM OUTPUT TABLE WHERE ID = (1,2,3,4 etc)
INSERT INTO OUTPUT TABLE Values (1,2,3,4 etc)
COMMIT TRAN;
Within a transaction all write locks (all locks acquired for modifications) must obey the strict two phase locking rule. One of the consequences is that a write (X) lock acquired in a transaction cannot be released until the transaction commits. So yes, the DELETE and INSERT will execute sequentially and all locks acquired during the DELETE will be retained while executing the INSERT.
Keep in mind that deleting 500k rows in a transaction will escalate the locks to one table lock, see Lock Escalation.
Deleting 500k rows and inserting 500k rows in a single transaction, while maybe correct, is a bad idea. You should avoid such large units of works, long transaction, if possible. Long transactions pin the log in place, create blocking and contention, increase recovery and DB startup time, increase SQL Server resource consumption (locks require memory).
You should consider doing the operation in small batches (perhaps 10000 rows at time), use MERGE instead of DELETE/INSERT (if possible) and, last but not least, consider a partitioned sliding window
implementation, see How to Implement an Automatic Sliding Window in a Partitioned Table.
From the documentation on TRANSACTION (emphasis mine):
BEGIN TRANSACTION represents a point at which the data referenced by a
connection is logically and physically consistent. If errors are
encountered, all data modifications made after the BEGIN TRANSACTION
can be rolled back to return the data to this known state of
consistency. Each transaction lasts until either it completes without
errors and COMMIT TRANSACTION is issued to make the modifications a
permanent part of the database, or errors are encountered and all
modifications are erased with a ROLLBACK TRANSACTION statement.
BEGIN TRANSACTION starts a local transaction for the connection
issuing the statement. Depending on the current transaction isolation
level settings, many resources acquired to support the Transact-SQL
statements issued by the connection are locked by the transaction
until it is completed with either a COMMIT TRANSACTION or ROLLBACK
TRANSACTION statement. Transactions left outstanding for long periods
of time can prevent other users from accessing these locked resources,
and also can prevent log truncation.
Although BEGIN TRANSACTION starts a local transaction, it is not
recorded in the transaction log until the application subsequently
performs an action that must be recorded in the log, such as executing
an INSERT, UPDATE, or DELETE statement. An application can perform
actions such as acquiring locks to protect the transaction isolation
level of SELECT statements, but nothing is recorded in the log until
the application performs a modification action.

Why use Implicit and Explicit Transaction Modes in SQL Server?

Reading MS documentation on different transaction modes in SQL Server.
Autocommit mode does everything the Implicit and Explicit Transaction mode does with less code, so why should I use Implicit and Explicit Transaction modes in my code ?
Autocommit transaction is only for single query. If you need transaction involving multiple queries, you must use Implicit and Explicit Transaction mode.
As you know sqlserver automatically done the job of transaction commit. But some time we need to commit /rollback on particular condition/logic/business rule(s).
For example, we have one master table and 3 child/details table or say 1 or more child tables.
Suppose, we must save master table entry along with all details table with reference of master table's pk-id. In any case anything an issue then revert whole thing.
So in this scenerio we need to use explicit transaction to commit or rollback as a unit of work. We can use try..catch block for error handling and rollback the transaction.
If we not used this transaction, then after each insert statement sqlserver auto-commit the inserted row and not rollback ever.

When do I have to commit?

I heard in SQL I do not have to commit every statement. Perhaps create I don't have to.
So can you answer me which Statements I have to commit?
I read, that I have to commit all transactions, but I don't know what this is and can't find it anywhere.
Thanks for your help.
Per the SQL standard, most statements that require a transaction will automatically open one.
Some database engines, such as SQL Server, will (by default) automatically commit the transaction if the statement completes successfully. See Autocommit Transactions.
Autocommit mode is the default transaction management mode of the SQL Server Database Engine. Every Transact-SQL statement is committed or rolled back when it completes
SQL Server also has an Implicit Conversions mode which will leave the transaction open until it's explicitly commited.
When operating in this second such mode (which is the default, I believe, for Oracle), or if you've explicitly created a transaction, it's up to you as a developer when to commit the transaction. It should be when you've accomplished a "complete" set of operations against the database.
If you BEGIN a transaction then you have to either ROLLBACK or COMMIT
Example:
BEGIN TRAN
--Your code
INSERT INTO
NewTable
SELECT *
FROM TABLE
COMMIT TRAN
If you do not use that, it is committed upon execution. So the follow will either fail or be committed:
INSERT INTO
NewTable
SELECT *
FROM Table
If there is an error (like there is no NewTable in the DB) the execution will raise an error and the transaction will roll back. If there is no error the transaction will be committed.

Management Studio 2005: Will Cancelling a Statement trigger a Rollback?

A few minutes ago, while working out a new sproc, I executed the wrong delete statement. Something like this:
Delete From SomeTable Where SomeStatusID=1
10 seconds into it, I realized that I typed in the wrong status and hit cancel. It said that the statement was canceled.
I did a restore to a separate database to get back the table that I just presumably nuked, thinking that since this was not in a transaction some records were probably deleted.
Oddly, the records were all intact. Just curious as to why this was -- did it treat the individual delete statement as a transaction in this case, even though there was not an explicit transaction defined?
By default, a single DML statement is executed as a transaction. If the statement succeeds, the transaction is committed. If the statement is cancelled or otherwise fails, the transaction is rolled back.
This behavior is built in to the engine, and is not related to management studio. See Autocommit Transactions in the docs.
There are always transactions. The only thing that changes is how those transactions are isolated from each other and that in management studio the transaction might not be defined explicitly and is auto-committed when the query finishes.

Should I commit or rollback a read transaction?

I have a read query that I execute within a transaction so that I can specify the isolation level. Once the query is complete, what should I do?
Commit the transaction
Rollback the transaction
Do nothing (which will cause the transaction to be rolled back at the end of the using block)
What are the implications of doing each?
using (IDbConnection connection = ConnectionFactory.CreateConnection())
{
using (IDbTransaction transaction = connection.BeginTransaction(IsolationLevel.ReadUncommitted))
{
using (IDbCommand command = connection.CreateCommand())
{
command.Transaction = transaction;
command.CommandText = "SELECT * FROM SomeTable";
using (IDataReader reader = command.ExecuteReader())
{
// Read the results
}
}
// To commit, or not to commit?
}
}
EDIT: The question is not if a transaction should be used or if there are other ways to set the transaction level. The question is if it makes any difference that a transaction that does not modify anything is committed or rolled back. Is there a performance difference? Does it affect other connections? Any other differences?
You commit. Period. There's no other sensible alternative. If you started a transaction, you should close it. Committing releases any locks you may have had, and is equally sensible with ReadUncommitted or Serializable isolation levels. Relying on implicit rollback - while perhaps technically equivalent - is just poor form.
If that hasn't convinced you, just imagine the next guy who inserts an update statement in the middle of your code, and has to track down the implicit rollback that occurs and removes his data.
If you haven't changed anything, then you can use either a COMMIT or a ROLLBACK. Either one will release any read locks you have acquired and since you haven't made any other changes, they will be equivalent.
If you begin a transaction, then best practice is always to commit it. If an exception is thrown inside your use(transaction) block the transaction will be automatically rolled-back.
Consider nested transactions.
Most RDBMSes do not support nested transactions, or try to emulate them in a very limited way.
For example, in MS SQL Server, a rollback in an inner transaction (which is not a real transaction, MS SQL Server just counts transaction levels!) will rollback the everything which has happened in the outmost transaction (which is the real transaction).
Some database wrappers might consider a rollback in an inner transaction as an sign that an error has occured and rollback everything in the outmost transaction, regardless whether the outmost transaction commited or rolled back.
So a COMMIT is the safe way, when you cannot rule out that your component is used by some software module.
Please note that this is a general answer to the question. The code example cleverly works around the issue with an outer transaction by opening a new database connection.
Regarding performance: depending on the isolation level, SELECTs may require a varying degree of LOCKs and temporary data (snapshots). This is cleaned up when the transaction is closed. It does not matter whether this is done via COMMIT or ROLLBACK. There might be a insignificant difference in CPU time spent - a COMMIT is probably faster to parse than a ROLLBACK (two characters less) and other minor differences. Obviously, this is only true for read-only operations!
Totally not asked for: another programmer who might get to read the code might assume that a ROLLBACK implies an error condition.
IMHO it can make sense to wrap read only queries in transactions as (especially in Java) you can tell the transaction to be "read-only" which in turn the JDBC driver can consider optimizing the query (but does not have to, so nobody will prevent you from issuing an INSERT nevertheless). E.g. the Oracle driver will completely avoid table locks on queries in a transaction marked read-only, which gains a lot of performance on heavily read-driven applications.
ROLLBACK is mostly used in case of an error or exceptional circumstances, and COMMIT in the case of successful completion.
We should close transactions with COMMIT (for success) and ROLLBACK (for failure), even in the case of read-only transactions where it doesn't seem to matter. In fact it does matter, for consistency and future-proofing.
A read-only transaction can logically "fail" in many ways, for example:
a query does not return exactly one row as expected
a stored procedure raises an exception
data fetched is found to be inconsistent
user aborts the transaction because it's taking too long
deadlock or timeout
If COMMIT and ROLLBACK are used properly for a read-only transaction, it will continue to work as expected if DB write code is added at some point, e.g. for caching, auditing or statistics.
Implicit ROLLBACK should only be used for "fatal error" situations, when the application crashes or exits with an unrecoverable error, network failure, power failure, etc.
Just a side note, but you can also write that code like this:
using (IDbConnection connection = ConnectionFactory.CreateConnection())
using (IDbTransaction transaction = connection.BeginTransaction(IsolationLevel.ReadUncommitted))
using (IDbCommand command = connection.CreateCommand())
{
command.Transaction = transaction;
command.CommandText = "SELECT * FROM SomeTable";
using (IDataReader reader = command.ExecuteReader())
{
// Do something useful
}
// To commit, or not to commit?
}
And if you re-structure things just a little bit you might be able to move the using block for the IDataReader up to the top as well.
If you put the SQL into a stored procedure and add this above the query:
set transaction isolation level read uncommitted
then you don't have to jump through any hoops in the C# code. Setting the transaction isolation level in a stored procedure does not cause the setting to apply to all future uses of that connection (which is something you have to worry about with other settings since the connections are pooled). At the end of the stored procedure it just goes back to whatever the connection was initialized with.
Given that a READ does not change state, I would do nothing. Performing a commit will do nothing, except waste a cycle to send the request to the database. You haven't performed an operation that has changed state. Likewise for the rollback.
You should however, be sure to clean up your objects and close your connections to the database. Not closing your connections can lead to issues if this code gets called repeatedly.
If you set AutoCommit false, then YES.
In an experiment with JDBC(Postgresql driver), I found that if select query breaks(because of timeout), then you can not initiate new select query unless you rollback.
Do you need to block others from reading the same data? Why use a transaction?
#Joel - My question would be better phrased as "Why use a transaction on a read query?"
#Stefan - If you are going to use AdHoc SQL and not a stored proc, then just add the WITH (NOLOCK) after the tables in the query. This way you dont incur the overhead (albeit minimal) in the application and the database for a transaction.
SELECT * FROM SomeTable WITH (NOLOCK)
EDIT # Comment 3: Since you had "sqlserver" in the question tags, I had assumed MSSQLServer was the target product. Now that that point has been clarified, I have edited the tags to remove the specific product reference.
I am still not sure of why you want to make a transaction on a read op in the first place.
In your code sample, where you have
// Do something useful
Are you executing a SQL Statement that changes data ?
If not, there's no such thing as a "Read" Transaction... Only changes from an Insert, Update and Delete Statements (statements that can change data) are in a Transaction... What you are talking about is the locks that SQL Server puts on the data you are reading, because of OTHER transactions that affect that data. The level of these locks is dependant on the SQL Server Isolation Level.
But you cannot Commit, or ROll Back anything, if your SQL statement has not changed anything.
If you are changing data, then you can change the isolation level without explicitly starting a transation... Every individual SQL Statement is implicitly in a transaction. explicitly starting a Transaction is only necessary to ensure that 2 or more statements are within the same transaction.
If all you want to do is set the transaction isolation level, then just set a command's CommandText to "Set Transaction Isolation level Repeatable Read" (or whatever level you want), set the CommandType to CommandType.Text, and execute the command. (you can use Command.ExecuteNonQuery() )
NOTE: If you are doing MULTIPLE read statements, and want them all to "see" the same state of the database as the first one, then you need to set the isolation Level top Repeatable Read or Serializable...