I'm coding my own data layer using JDBC for accessing SQL databases and one of the main components is the transaction manager.
I'm a little bit confused whether or not to support the nested transactions in this component. A sample code of a nested transaction is as follows:
tx1.begin();
... // do something with tx1
tx2.begin();
... // do something with tx2
tx2.commit();
...
tx1.commit();
During my past development experiments, I've never needed them and I think that they make the code more complex. But, I'm not sure that they are useless or useful. Can you give some example cases in which a nested transaction is required or at least advantageous? And what are the pros and cons of them?
To clarify my question and to explain what I mean by a transaction, I pasted my comment below:
I'm using JDBC. So, the transaction manager is independent of the underlying database. By transaction, I mean non-autoCommit JDBC connections. The transaction manager returns a transaction object with a non-autoCommit connection. The client code using this transcaction, commits and closes the connection by committing the transaction object.
Thanks in advance.
Related
I have been reading the SQLite documentation and also referencing code I have written previously but I don't seem to be able to find a definitive answer to what I imagine to be a rather simple question.
I would like to execute many (separate) compiled statements within a transaction, but child threads may also be creating transactions or just executing statements at the same time and I would not want them included in this particular transaction. Currently, I have a single database handle that I share between all threads.
So, my question is,
1) .. is it generally better to have some kind of semaphore around transactions to ensure they will not clash/collect with other statements being executed against a database handle. I already marshal writes to prevent problems with multithreaded issues with SQLite (although with WAL now it's very hard to unsettle it at all).
2) .. or are you expected to open multiple database connections and start/commit the transactions one per database connection if they will be concurrent?
Changes made in one database connection are invisible to all other database connections prior to commit.
So it seems a hybrid approach of having several connections open to the database provides adequate concurrency guarantees, trading off the expense of opening a new connection with the benefit of allowing multi-threaded write transactions.
A query sees all changes that are completed on the same database connection prior to the start of the query, regardless of whether or not those changes have been committed.
If changes occur on the same database connection after a query starts running but before the query completes, then it is undefined whether or not the query will see those changes.
If changes occur on the same database connection after a query starts running but before the query completes, then the query might return a changed row more than once, or it might return a row that was previously deleted.
For the purposes of the previous four items, two database connections that use the same shared cache and which enable PRAGMA read_uncommitted are considered to be the same database connection, not separate database connections.
Here is the SQLite information on isolation. Which is exceptionally useful to read and understand for this problem.
I am unclear on the above statement and how it ties into the various propogation levels within JTA. When a method is annotated with a Transactional Attribute as "Requires_New, a new transaction is always started even when a transaction is already existing. Is this not a nested transaction? Also Spring supports the "Nested" as an additional transactional attribute (over JEE).
Can anyone please explain what this means?
Thanks
Sirish
Many databases does not actually implement nested transactions but uses transaction savepoints instead, so when it appears like a nested transaction is started in the application, what really happens is just that a transaction savepoint is created in the database, and if something happens inside what appears to be a nested transaction in the application, the database just rolls back to the latest savepoint.
https://en.wikipedia.org/wiki/Savepoint
I'm not sure if this is really related to subject, though ...
I was wondering if it's a good practice to nest two transactions? For example wrapping my NHibernate transaction with TransactionScope for the benefit of the Tests (making sure that the db rollbacks all the changes that were made in the test).
The other option is to keep the entities that I insert into the Db in memory and delete them at the end of the test.
Which one is better?
First of all, nhibernate doesn't support nested transactions!
TransactionScope on the other side will not create a new transaction if there is already one opened. If you only use transaction scope, it will create a new transaction for the connection.
If you then open a transaction within the scope, this will still work with nhibernate.
Back to your question, it pretty much depends on the amount of objects you create within the TransactionScope. If it becomes too many, you will simply spam the transaction log of your database. Apart from that, the concept is perfectly fine I would say.
And one important thing to mention, if you use TransactionScope, and you create multiple sessions/transaction with nhibernate, the scope might switch to distributed transactions which requires MSDTC to run on the target server, otherwise it will simply fail.
I have a question. Imagine you'd have an object you'd want to save in a transaction, the object having collections of other objects etc so its a more "complex" object.
Anyway, sometimes we save objects like that, but in the meantime, we use another thread that occasionally reads said data and synchronizes them up to our central server. However we've noticed problems that in some occasions objects get synced over without all the collection objects.
Since this only happens every once in a while we figured it could be the transaction isolation level. Maybe the synchronization thread reads the data before the transaction is done persisting all the objects, thus only reading half the data needed, and sending it over.
Because we all know that the clients data is all saved, all the time, it's just that sometimes it doesn't tag along when it's being sent to us.
So we'd want some kind of lock I suppose, I just don't know anything about these locks. Which one should we use?
There are no outside sources working towards the database in this case, since it's a WPF application on a client's customer.
Any help would be appreciated!
Best regards,
E.
Every database supports a set of standard isolation levels. These are all meant to prevent to a certain level that you read data that is modified inside another transaction. I suggest you first read up on what these isoloation levels mean.
In your specific situation, I'd suggest that for the transaction that is reading the data, you use at least an isolation level of ReadCommitted. In code, this would look like this:
using (var transactionScope = new TransactionScope(TransactionScopeOption.Required,
new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted }))
{
// Read the data you want from the database.
...
transactionScope.Complete();
// Return the data.
}
Using a TransactionScope with IsolationLevel.ReadCommitted prevents that you read data that has not yet been committed by another transaction.
The code that writes data to the database should also be put inside one transaction. As long as you only write data inside that transaction, the isolation level for that transaction doesn't matter. This guarantees the atomicity of your updates: either all updates succeed or none of them. This also prevents another transaction from reading a partial update.
Do you know of any ORM tool that offers deadlock recovery? I know deadlocks are a bad thing but sometimes any system will suffer from it given the right amount of load. In Sql Server, the deadlock message says "Rerun the transaction" so I would suspect that rerunning a deadlock statement is a desirable feature on ORM's.
I don't know of any special ORM tool support for automatically rerunning transactions that failed because of deadlocks. However I don't think that a ORM makes dealing with locking/deadlocking issues very different. Firstly, you should analyze the root cause for your deadlocks, then redesign your transactions and queries in a way that deadlocks are avoided or at least reduced. There are lots of options for improvement, like choosing the right isolation level for (parts) of your transactions, using lock hints etc. This depends much more on your database system then on your ORM. Of course it helps if your ORM allows you to use stored procedures for some fine-tuned command etc.
If this doesn't help to avoid deadlocks completely, or you don't have the time to implement and test the real fix now, of course you could simply place a try/catch around your save/commit/persist or whatever call, check catched exceptions if they indicate that the failed transaction is a "deadlock victim", and then simply recall save/commit/persist after a few seconds sleeping. Waiting a few seconds is a good idea since deadlocks are often an indication that there is a temporary peak of transactions competing for the same resources, and rerunning the same transaction quickly again and again would probably make things even worse.
For the same reason you probably would wont to make sure that you only try once to rerun the same transaction.
In a real world scenario we once implemented this kind of workaround, and about 80% of the "deadlock victims" succeeded on the second go. But I strongly recommend to digg deeper to fix the actual reason for the deadlocking, because these problems usually increase exponentially with the number of users. Hope that helps.
Deadlocks are to be expected, and SQL Server seems to be worse off in this front than other database servers. First, you should try to minimize your deadlocks. Try using the SQL Server Profiler to figure out why its happening and what you can do about it. Next, configure your ORM to not read after making an update in the same transaction, if possible. Finally, after you've done that, if you happen to use Spring and Hibernate together, you can put in an interceptor to watch for this situation. Extend MethodInterceptor and place it in your Spring bean under interceptorNames. When the interceptor is run, use invocation.proceed() to execute the transaction. Catch any exceptions, and define a number of times you want to retry.
An o/r mapper can't detect this, as the deadlock is always occuring inside the DBMS, which could be caused by locks set by other threads or other apps even.
To be sure a piece of code doesn't create a deadlock, always use these rules:
- do fetching outside the transaction. So first fetch, then perform processing then perform DML statements like insert, delete and update
- every action inside a method or series of methods which contain / work with a transaction have to use the same connection to the database. This is required because for example write locks are ignored by statements executed over the same connection (as that same connection set the locks ;)).
Often, deadlocks occur because either code fetches data inside a transaction which causes a NEW connection to be opened (which has to wait for locks) or uses different connections for the statements in a transaction.
I had a quick look (no doubt you have too) and couldn't find anything suggesting that hibernate at least offers this. This is probably because ORMs consider this outside of the scope of the problem they are trying to solve.
If you are having issues with deadlocks certainly follow some of the suggestions posted here to try and resolve them. After that you just need to make sure all your database access code gets wrapped with something which can detect a deadlock and retry the transaction.
One system I worked on was based on “commands” that were then committed to the database when the user pressed save, it worked like this:
While(true)
start a database transaction
Foreach command to process
read data the command need into objects
update the object by calling the command.run method
EndForeach
Save the objects to the database
If not deadlock
commit the database transaction
we are done
Else
abort the database transaction
log deadlock and try again
EndIf
EndWhile
You may be able to do something like with any ORM; we used an in house data access system, as ORM were too new at the time.
We run the commands outside of a transaction while the user was interacting with the system. Then rerun them as above (when you use did a "save") to cope with changes other people have made. As we already had a good ideal of the rows the command would change, we could even use locking hints or “select for update” to take out all the write locks we needed at the start of the transaction. (We shorted the set of rows to be updated to reduce the number of deadlocks even more)