When working with database transactions, what are the possible conditions (if any) that would cause the final COMMIT statement in a transaction to fail, presuming that all statements within the transaction already executed without issue?
For example... let's say you have some two-phase or three-phase commit protocol where you do a bunch of statements, then wait for some master process to tell you when it is ok to finally commit the transaction:
-- <initial handshaking stuff>
START TRANSACTION;
-- <Execute a bunch of SQL statements>
-- <Inform master of readiness to commit>
-- <Time passes... background transactions happening while we wait>
-- <Receive approval to commit from master (finally!)>
COMMIT;
If your code gets to that final COMMIT statement and sends it to your DBMS, can you ever get an error (uniqueness issue, database full, etc) at that statement? What errors? Why? How do they appear? Does it vary depending on what DBMS you run?
COMMIT may fail. You might have had sufficient resources to log all the changes you wished to make, but lack resources to actually implement the changes.
And that's not considering other reasons it might fail:
The change itself might not fit the constraints of the database.
Power loss stops things from completing.
The level of requested selection concurrency might disallow an update (cursors updating a modified table, for example).
The commit might time out or be on a connection which times out due to starvation issues.
The network connection between the client and the database may be lost.
And all the other "simple" reasons that aren't on the top of my head.
It is possible for some database engines to defer UNIQUE index constraint checking until COMMIT. Obviously if the constraint does not hold true at the time of commit then it will fail.
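A minimal sketch of a deferred check failing at COMMIT, using Python's built-in sqlite3 module. This is an illustration under assumptions: SQLite does not defer UNIQUE checks, so a deferred foreign key stands in for the deferred constraint, and the table names and the function name are hypothetical.

```python
import sqlite3


def commit_with_deferred_violation():
    """Return the error raised by COMMIT, or None if COMMIT succeeded."""
    # isolation_level=None: the sqlite3 module issues no implicit BEGIN/COMMIT,
    # so the transaction boundaries below are exactly the ones we write.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    try:
        conn.execute("PRAGMA foreign_keys = ON")
        conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE child ("
            " id INTEGER PRIMARY KEY,"
            " parent_id INTEGER REFERENCES parent(id)"
            " DEFERRABLE INITIALLY DEFERRED)"
        )
        conn.execute("BEGIN")
        # This statement succeeds: the foreign key check is deferred.
        conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 99)")
        try:
            # The deferred check runs here, so this is where the error shows up.
            conn.execute("COMMIT")
            return None
        except sqlite3.IntegrityError as exc:
            # The failed COMMIT leaves the transaction open; end it explicitly.
            conn.execute("ROLLBACK")
            return exc
    finally:
        conn.close()


if __name__ == "__main__":
    print("COMMIT failed with:", commit_with_deferred_violation())
```

Every statement inside the transaction ran without error, yet the COMMIT itself raised, which is the situation the question asks about.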
Sure.
In a multi-user environment, the COMMIT may fail because of changes by other users (e.g. your COMMIT would violate a referential constraint when applied to the now current database...).
Thomas
If you're using two-phase commit, then no. Everything that could go wrong is done in the prepare phase.
There could still be a network outage, power loss, cosmic rays, etc., during the commit, but even so, the transactions will have been written to permanent storage, and if a commit has been triggered, recovery processes should carry them through.
Hopefully.
Certainly, there could be a number of issues. The act of committing, in and of itself, must make some final, permanent entry to indicate that the transaction committed. If making that entry fails, then the transaction can't commit.
As Ignacio states, there can be deferred constraint checking (this could be any form of constraint, not just unique constraint, depending on the DBMS engine).
SQL Server Specific: flushing FILESTREAM data can be deferred until commit time. That could fail.
One very simple and often overlooked item: hardware failure. The commit can fail if the underlying server dies. This might be disk, cpu, memory, or even network related.
The transaction could fail if it never receives approval from the master (for any number of reasons).
No matter how wonderfully a system may be designed, there is going to be some possibility that a commit will get into a situation where it's impossible to know whether it succeeded or not. In some cases, it may not matter (e.g. if a hard drive holding the database turns into a pile of slag, it may be impossible to tell whether the commit succeeded or not before that occurred, but it wouldn't really matter); in other cases, however, this could be a problem. Especially with distributed database systems, if a connection failure occurs at just the right time during a commit, it will be impossible for both sides to be certain of whether the other side is expecting a commit or a rollback.
With MySQL or MariaDB, when used with Galera clustering, COMMIT is when the other nodes in the cluster are checked. So yes, important errors can be discovered by COMMIT, and you must check for these errors.
By mistake, I ran this query in Informix from a dbaccess session.
Delete from table #without where condition
Realizing my mistake, that I should have used TRUNCATE, I did another foolish thing.
I killed the dbaccess session. But the table is exclusively locked and I am not able to do any action on that table.
What steps can I take to remove the lock and truncate the table?
1) Restart Informix server
2) onmode -z <sessionid> # Does not work.
I see a hell of a lot of sessions created for the delete query.
Is there any other easy way to fix this issue?
Assuming that you are not using Informix SE...
Is the database logged? If so, did you run the statement inside an explicit (BEGIN WORK) transaction?
Analysis
If you've got an unlogged database, then each row that the server's deleted is gone. If you stop the DELETE, it will not undo the partially complete changes. Using an unlogged database means that you do not want guaranteed statement level recovery.
If you've got a regular logged database and no explicit transaction, then the statement is probably still running after the DB-Access session is terminated. Because it is running as a singleton statement, it will complete and commit. Until it does that, if you forcibly take the server down, then fast recovery will roll back the statement (transaction). Given that I see '5 hours ago', I fear your chances of taking the server down in time now are limited.
If you've got a logged database with an explicit transaction, or a MODE ANSI database (where you're always in a transaction), then when the DELETE statement completes, the server will wait for the COMMIT, realize that the session connection is terminated, and will roll back the uncommitted work.
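The explicit-transaction case can be sketched in miniature with Python's sqlite3 module. This is only an analogy under assumptions: SQLite stands in for Informix, closing the connection stands in for the killed session, and the table and function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def rows_after_abandoned_delete():
    """Delete everything inside an explicit transaction, drop the session
    without committing, and return how many rows a new session still sees."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        setup = sqlite3.connect(path, isolation_level=None)
        setup.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
        setup.executemany("INSERT INTO t (id) VALUES (?)", [(1,), (2,), (3,)])
        setup.close()

        session = sqlite3.connect(path, isolation_level=None)
        session.execute("BEGIN")          # explicit transaction
        session.execute("DELETE FROM t")  # DELETE without a WHERE clause
        session.close()                   # session ends, no COMMIT was sent

        check = sqlite3.connect(path)
        count = check.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]
        check.close()
        return count


if __name__ == "__main__":
    print("rows still present:", rows_after_abandoned_delete())
```

Because the DELETE was never committed, the uncommitted work is discarded when the session goes away and the rows are still there for the next session.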
Recovery
If you've got an unlogged database, you can only recover to your last archive. Because it is unlogged, you can't recover it from the logical logs (but other databases in the same instance that are logged can be recovered up to the last logical log).
If you've got a logged database and you can take the server down — preferably under control, but crashing it if necessary — before the DELETE statement completes, then fast recovery will deal with the issue.
If the DELETE has completed and committed and you have good backups, you can consider a point-in-time restore of the database. It will take it offline while you do that (but if the data from the table is all missing, your DB is not going to be functional for a while).
If none of these scenarios applies, then you should contact IBM Technical Support, who may be able to perform minor (and not so minor) miracles.
But, as you may have noticed, a lot depends on the type of database (unlogged, logged, MODE ANSI) and whether there was an explicit transaction in effect when you ran the statement.
The trouble with DBMS is that they're trusting creatures. If you're authorized to do an operation, they assume that you intend to do what you say you want to do, and they go ahead and do it to the best of their ability. When you don't ask it to do what you intended to request, life gets tricky; the DBMS still trusts you and does what you actually asked it to do.
I have a stored procedure called MoveSomeItems which gets some rows from tableA in the Foo database and moves them to tableA in the Bar database.
I want to test whether this stored procedure really moves the items.
Is it enough to run it in a transaction and select the rows to see if they were moved, or should I approach it in a different way?
This depends on what the impact would be if it all went wrong. What would the impact of incorrect data in the destination table be: will it kill someone, simply annoy them, or is it unlikely anyone will notice? Will it be easy to fix?
There are risks associated with the approach you have given. For instance:
If the database is very busy, it is possible to cause excessive locking or even a deadlock with a transaction that may cause other transactions to fail. Setting the TRANSACTION ISOLATION LEVEL to READ UNCOMMITTED and the DEADLOCK_PRIORITY to LOW will help to minimise this but not eliminate it entirely.
There is the possibility that other transactions may be running in READ UNCOMMITTED isolation mode. In which case they will see the results of the insert temporarily until the rollback is issued.
It is worth noting that if the procedure you are testing calls COMMIT TRANSACTION inside it you might not get the result you want when you call the ROLLBACK.
You might push the database or log to run out of disk space.
You might use up all the available CPU, Memory, Disk IO, Network or some other capacity limit.
Finally, I suspect this is not a complete list. The point I’m trying to make is that it could go wrong in strange ways.
If you have a personal development database that is fully backed up then you wouldn't even need the transaction, simply do a restore after the event. The transaction may well save you some time though. This is the safest solution.
If you are using a shared development database your approach might be acceptable enough, but I would still do a backup just in case, especially if you are already on bad terms with the team.
If you are using a live database it may still be acceptable if the system as a whole is not that critical and can sustain some downtime while you repair things. Again do a backup.
If the database you are looking at is controlling a process that is safety critical or some other mission critical function, don't even go there: you may lose the no claims on your liability insurance, or worse. In this instance it is best to restore a backup onto a test server and test there, thus creating my first scenario. But be warned: there are lots of issues that have to be considered when doing this. For instance, it may be illegal to use personal information in a test system. Also, there may be dependencies on other systems that will need to be mocked out to ensure you don't affect them; for example, don't connect a test system to a live email server.
If I have a complex stored proc that I want to be able to test and roll back, I add an input parameter (always as the last parameter), @debug, with a default value of 0 (so you don't need to specify it when you are running on prod).
Then I write code at the end to test whether the parameter = 1 and, if so, I run any select queries that show me the data I want to see and then send the program to the catch block using RAISERROR (never write multiple transactions without a try/catch block) and have it roll back.
This way you can easily check your results on dev and automatically rollback.
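The same idea can be sketched outside T-SQL with Python's sqlite3 module. This is an illustration under assumptions: two tables in one in-memory database stand in for the two databases, and table_a, table_b, the ready column, make_demo_db and the debug parameter are all hypothetical names.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database with a few rows to move."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT, ready INTEGER)"
    )
    conn.execute("CREATE TABLE table_b (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO table_a (id, name, ready) VALUES (?, ?, ?)",
        [(1, "x", 1), (2, "y", 0), (3, "z", 1)],
    )
    return conn


def move_some_items(conn, debug=False):
    """Move ready rows from table_a to table_b.

    With debug=True, return what the destination would contain and roll
    everything back instead of committing.
    """
    conn.execute("BEGIN")
    try:
        conn.execute(
            "INSERT INTO table_b (id, name) "
            "SELECT id, name FROM table_a WHERE ready = 1"
        )
        conn.execute("DELETE FROM table_a WHERE ready = 1")
        if debug:
            moved = conn.execute(
                "SELECT id, name FROM table_b ORDER BY id"
            ).fetchall()
            conn.execute("ROLLBACK")  # inspect, then undo
            return moved
        conn.execute("COMMIT")
        return None
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    db = make_demo_db()
    print("would move:", move_some_items(db, debug=True))
```

Calling it with debug=True shows the rows that would be moved while leaving both tables as they were; the default debug=False keeps the production call unchanged.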
I'm looking for something similar to an SQL transaction. I need the usual protections that transactions provide, but I don't want it to slow down anyone else.
Imagine client A connects to the DB and runs these commands:
BEGIN TRAN
SELECT (something)
(Wait a few seconds maybe.)
UPDATE (something)
COMMIT
In between the SELECT and the UPDATE, client B comes along and attempts to do a query that, under normal circumstances, would end up having to wait for A to COMMIT.
What I'd like is for client A to open its transaction in such a way that, should B come along and perform its query, client A will find its transaction immediately rolled back and its subsequent commands failing. Client B would only experience minimal delay.
(Note that the SELECT and UPDATE are simply illustrative commands.)
Update...
I've got a high priority task (client B) that sometimes (once a month-ish) gets an SQL timeout error, and a low priority task (client A) with a transaction which causes that timeout. I'd rather that the low priority task fails and is reattempted in the next cycle.
I ended up fixing this problem by eliminating the transactions entirely and replacing them with an informal set of flags. The queries were refactored to only do something if the right set of flags are raised and I added something that cleared up abandoned records that the rollback would have cleared in the past.
I fixed my transaction issues by eliminating transactions.
Using SNAPSHOT isolation level will prevent B from blocking. B will see the data in the state it was in before A issued BEGIN TRANSACTION. Unless B modifies data, they will never block each other.
While not a transaction at all, Optimistic Concurrency may be useful -- it is used by default in LINQ2SQL, etc.
The general idea is that the data is read -- modifications can be independently made -- and then the data written back with a "check" (this is loosely comparable to a Compare and Swap). If the check fails it is up to the application to decide what to do (restart the process, proceed anyway, fail).
This naturally doesn't work for all scenarios and may not detect a number of interactions, such as new items added between the "read" and "write". Both the actual read and write can be in separate transactions with the appropriate isolation level; the separate transactions may allow additional transactions to be interleaved.
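A minimal sketch of that read, modify, write-with-check cycle, using Python's sqlite3 module. The items table, its version column and the function name are assumptions made for illustration; this is not how LINQ2SQL implements it.

```python
import sqlite3


def update_if_unchanged(conn, item_id, new_value, expected_version):
    """Write new_value only if the row still carries expected_version.

    Returns True if the write happened, False if someone else changed the
    row in the meantime (the "check" failed).
    """
    cur = conn.execute(
        "UPDATE items SET value = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_value, item_id, expected_version),
    )
    return cur.rowcount == 1


def demo():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT, version INTEGER)"
    )
    conn.execute("INSERT INTO items (id, value, version) VALUES (1, 'original', 1)")

    # Clients A and B both read the row while it is at version 1.
    version_seen_by_a = 1
    version_seen_by_b = 1

    b_ok = update_if_unchanged(conn, 1, "from B", version_seen_by_b)
    a_ok = update_if_unchanged(conn, 1, "from A", version_seen_by_a)
    final = conn.execute("SELECT value, version FROM items WHERE id = 1").fetchall()[0]
    conn.close()
    return b_ok, a_ok, final


if __name__ == "__main__":
    print(demo())
```

No lock is held between the read and the write, so B is never blocked by A. Whichever client writes second finds its check failing and has to decide whether to retry, merge or give up.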
Of course, depending upon the exact problem and interactions... different isolation levels and/or finer grained locking may be sufficient.
Happy coding.
That is back to front.
You can't have later clients aborting earlier transactions: that's chaos.
You can have snapshot isolation so that client B has a consistent view and isn't blocked (mostly) by client A. See also Wikipedia for more general background.
Perhaps describe your problem more fully so we can offer suggestions for that...
One thing that I've seen used (but I'm afraid that I don't have any code handy for it) is having transaction A spawn another process which then monitors the transaction. If it sees any blocks caused by the transaction then it immediately issues a KILL to the spid.
If I can find the code for this then I'll add it here.
Do you know of any ORM tool that offers deadlock recovery? I know deadlocks are a bad thing, but sometimes any system will suffer from them given the right amount of load. In SQL Server, the deadlock message says "Rerun the transaction", so I would suspect that rerunning a deadlocked statement is a desirable feature in ORMs.
I don't know of any special ORM tool support for automatically rerunning transactions that failed because of deadlocks. However, I don't think that an ORM makes dealing with locking/deadlocking issues very different. Firstly, you should analyze the root cause of your deadlocks, then redesign your transactions and queries in a way that deadlocks are avoided or at least reduced. There are lots of options for improvement, like choosing the right isolation level for (parts of) your transactions, using lock hints etc. This depends much more on your database system than on your ORM. Of course it helps if your ORM allows you to use stored procedures for some fine-tuned commands etc.
If this doesn't help to avoid deadlocks completely, or you don't have the time to implement and test the real fix now, of course you could simply place a try/catch around your save/commit/persist or whatever call, check whether the caught exception indicates that the failed transaction is a "deadlock victim", and then simply call save/commit/persist again after sleeping for a few seconds. Waiting a few seconds is a good idea since deadlocks are often an indication that there is a temporary peak of transactions competing for the same resources, and rerunning the same transaction quickly again and again would probably make things even worse.
For the same reason you would probably want to make sure that you only try once to rerun the same transaction.
In a real world scenario we once implemented this kind of workaround, and about 80% of the "deadlock victims" succeeded on the second go. But I strongly recommend digging deeper to fix the actual reason for the deadlocking, because these problems usually increase exponentially with the number of users. Hope that helps.
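The try/catch, sleep, retry-once workaround might look roughly like the sketch below. DeadlockVictim is a hypothetical stand-in for whatever driver-specific exception or error code marks a deadlock victim in your stack, and run_with_one_retry is an illustrative name, not an ORM API.

```python
import time


class DeadlockVictim(Exception):
    """Hypothetical stand-in for the driver-specific deadlock-victim error."""


def run_with_one_retry(operation, delay_seconds=2.0, sleep=time.sleep):
    """Run operation(); if it is chosen as a deadlock victim, wait and rerun
    it exactly once. A second failure propagates to the caller."""
    try:
        return operation()
    except DeadlockVictim:
        # Back off so the burst of competing transactions can drain.
        sleep(delay_seconds)
        return operation()


if __name__ == "__main__":
    attempts = []

    def flaky_save():
        # Simulated save/commit call: deadlocks on the first attempt only.
        attempts.append(1)
        if len(attempts) == 1:
            raise DeadlockVictim("chosen as deadlock victim")
        return "saved"

    print(run_with_one_retry(flaky_save, delay_seconds=0.1))
```

The operation passed in has to redo the whole transaction, not just the failing statement, because the database has already rolled the victim's transaction back.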
Deadlocks are to be expected, and SQL Server seems to be worse off on this front than other database servers. First, you should try to minimize your deadlocks. Try using the SQL Server Profiler to figure out why it's happening and what you can do about it. Next, configure your ORM to not read after making an update in the same transaction, if possible. Finally, after you've done that, if you happen to use Spring and Hibernate together, you can put in an interceptor to watch for this situation. Extend MethodInterceptor and place it in your Spring bean under interceptorNames. When the interceptor is run, use invocation.proceed() to execute the transaction. Catch any exceptions, and define a number of times you want to retry.
An o/r mapper can't detect this, as the deadlock always occurs inside the DBMS, and could even be caused by locks set by other threads or other apps.
To be sure a piece of code doesn't create a deadlock, always use these rules:
- do fetching outside the transaction. So first fetch, then perform processing then perform DML statements like insert, delete and update
- every action inside a method or series of methods which contain / work with a transaction have to use the same connection to the database. This is required because for example write locks are ignored by statements executed over the same connection (as that same connection set the locks ;)).
Often, deadlocks occur because either code fetches data inside a transaction which causes a NEW connection to be opened (which has to wait for locks) or uses different connections for the statements in a transaction.
I had a quick look (no doubt you have too) and couldn't find anything suggesting that Hibernate at least offers this. This is probably because ORMs consider this outside the scope of the problem they are trying to solve.
If you are having issues with deadlocks certainly follow some of the suggestions posted here to try and resolve them. After that you just need to make sure all your database access code gets wrapped with something which can detect a deadlock and retry the transaction.
One system I worked on was based on “commands” that were then committed to the database when the user pressed save. It worked like this:
While(true)
    start a database transaction
    Foreach command to process
        read data the command needs into objects
        update the objects by calling the command.run method
    EndForeach
    Save the objects to the database
    If not deadlock
        commit the database transaction
        we are done
    Else
        abort the database transaction
        log deadlock and try again
    EndIf
EndWhile
You may be able to do something like this with any ORM; we used an in-house data access system, as ORMs were too new at the time.
We ran the commands outside of a transaction while the user was interacting with the system, then reran them as above (when the user did a "save") to cope with changes other people had made. As we already had a good idea of the rows the commands would change, we could even use locking hints or “select for update” to take out all the write locks we needed at the start of the transaction. (We sorted the set of rows to be updated to reduce the number of deadlocks even more.)
The in-house application framework we use at my company makes it necessary to put every SQL query into a transaction, even if I know that none of the commands will make changes in the database. At the end of the session, before closing the connection, I commit the transaction to close it properly. I wonder whether there would be any particular difference if I rolled it back instead, especially in terms of speed.
Please note that I am using Oracle, but I guess other databases have similar behaviour. Also, I can't do anything about the requirement to begin the transaction, that part of the codebase is out of my hands.
Databases often preserve either a before-image journal (what it was before the transaction) or an after-image journal (what it will be when the transaction completes.) If it keeps a before-image, that has to be restored on a rollback. If it keeps an after-image, that has to replace data in the event of a commit.
Oracle has both a journal and rollback space. The transaction journal accumulates blocks which are later written by DB writers. Since these are asynchronous, almost nothing DB writer related has any impact on your transaction (if the queue fills up, then you might have to wait).
Even for a query-only transaction, I'd be willing to bet that there's some little bit of transactional record-keeping in Oracle's rollback areas. I suspect that a rollback requires some work on Oracle's part before it determines there's nothing to actually roll back. And I think this is synchronous with your transaction. You can't really release any locks until the rollback is completed. [Yes, I know you aren't using any in your transaction, but the locking issue is why I think a rollback has to be fully completed before all the locks can be released, and only then is your rollback finished.]
On the other hand, the commit is more-or-less the expected outcome, and I suspect that discarding the rollback area might be slightly faster. You created no transaction entries, so the db writer will never even wake up to check and discover that there was nothing to do.
I also expect that while commit may be faster, the differences will be minor. So minor, that you might not be able to even measure them in a side-by-side comparison.
I agree with the previous answers that there's no difference between COMMIT and ROLLBACK in this case. There might be a negligible difference in the CPU time needed to determine that there's nothing to COMMIT versus the CPU time needed to determine that there's nothing to ROLLBACK. But, if it's a negligible difference, we can safely forget about it.
However, it's worth pointing out that there's a difference between a session that does a bunch of queries in the context of a single transaction and a session that does the same queries in the context of a series of transactions.
If a client starts a transaction, performs a query, performs a COMMIT or ROLLBACK, then starts a second transaction and performs a second query, there's no guarantee that the second query will observe the same database state as the first query. Sometimes, maintaining a single consistent view of the data is of the essence. Sometimes, getting a more current view of the data is of the essence. It depends on what you are doing.
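The difference can be sketched with Python's sqlite3 module. This is an analogy under assumptions: SQLite in WAL mode gives a reader a stable snapshot for the length of its transaction, which is not the same mechanism Oracle uses, and the table and function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # fetchall() exhausts the cursor so no statement is left half-read.
    return conn.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]


def observe_counts():
    """Return the row counts a reader sees (first query, second query in the
    same transaction, query in a new transaction) while a writer commits a
    new row in between."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")

        writer = sqlite3.connect(path, isolation_level=None)
        writer.execute("PRAGMA journal_mode = WAL")
        writer.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
        writer.execute("INSERT INTO t (id) VALUES (1)")

        reader = sqlite3.connect(path, isolation_level=None)
        reader.execute("BEGIN")
        first = count_rows(reader)      # the read snapshot is taken here

        # Another session commits a change while the reader's transaction is open.
        writer.execute("INSERT INTO t (id) VALUES (2)")

        same_txn = count_rows(reader)   # still inside the first transaction
        reader.execute("COMMIT")
        new_txn = count_rows(reader)    # a fresh transaction, a fresh view

        reader.close()
        writer.close()
        return first, same_txn, new_txn


if __name__ == "__main__":
    print(observe_counts())
```

Both queries inside the one transaction agree with each other, while the query issued after the COMMIT picks up the other session's change.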
I know, I know, the OP didn't ask this question. But some readers may be asking it in the back of their minds.
In general a COMMIT is much faster than a ROLLBACK, but in the case where you have done nothing they are effectively the same.
The documentation states that:
Oracle recommends that you explicitly end every transaction in your application programs with a COMMIT or ROLLBACK statement, including the last transaction, before disconnecting from Oracle Database. If you do not explicitly commit the transaction and the program terminates abnormally, then the last uncommitted transaction is automatically rolled back. A normal exit from most Oracle utilities and tools causes the current transaction to be committed. A normal exit from an Oracle precompiler program does not commit the transaction and relies on Oracle Database to roll back the current transaction.
http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/statements_4010.htm#SQLRF01110
If you want to choose to do one or the other, then you might as well do the one that is the same as doing nothing, and just commit it.
Well, we must take into account what a SELECT returns in Oracle. There are two modes. By default a SELECT returns data as that data looked at the very moment the SELECT statement started executing (this is the default behavior in READ COMMITTED isolation mode, the default transactional mode). So if an UPDATE/INSERT was executed after the SELECT was issued, it won't be visible in the result set.
This can be a problem if you need to compare two result sets (for example the debit and credit sides of a general ledger app). For that we have a second mode. In that mode a SELECT returns data as it looked at the moment the current transaction began (the default behavior in READ ONLY and SERIALIZABLE isolation levels).
So, at least sometimes, it is necessary to execute SELECTs in a transaction.
Since you've not done any DML, I suspect there'd be no difference between a COMMIT and ROLLBACK in Oracle. Either way there's nothing to do.
I'd think a Commit would be more efficient; since generally you'd expect most DB transactions to be committed; so you would think the DB optimizes for this case (as opposed to trying to be more efficient for a rollback).