What is the best default transaction isolation level for an ERP, if any?

Short background: We are just starting to migrate/reimplement an ERP system to Java with Hibernate, targeting 50-100 concurrent users. We use MS SQL Server as the database server, which is good enough for this load.
Now, the old system doesn't use any transactions at all. For critical parts (e.g. stock changes) it relies on setting and releasing manual locks (using flags), which amounts to manual transaction management, and there are sometimes problems with data inconsistency. In the new system we would like to use transactions to wipe out these problems.
Now the question: What would be a good/reasonable default transaction isolation level to use for an ERP system, given a usage of about 85% OLTP and 15% OLAP? Or should I always decide on a per-task basis which isolation level to use?
And as a reminder, the four transaction isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE

99 times out of 100, read committed is the right answer. That ensures that you only see changes that have been committed by the other session (and, thus, results that are consistent, assuming you've designed your transactions correctly). But it doesn't impose the locking overhead (particularly in non-Oracle databases) that repeatable read or serializable impose.
Very occasionally, you may want to run a report where you are willing to sacrifice accuracy for speed and set a read uncommitted isolation level. That's rarely a good idea, but it is occasionally a reasonably acceptable workaround to lock contention issues.
Serializable and repeatable read are occasionally used when you have a process that needs to see a consistent set of data over the entire run regardless of what other transactions are doing at the time. It may be appropriate to set a month-end reconciliation process to serializable, for example, if there is a lot of procedural code, a possibility that users are going to be making changes while the process is running, and a requirement that the process always sees the data as it existed at the time the reconciliation started.
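For the Hibernate stack described in the question, a minimal sketch of "sane default, override per task" might look like the following. It assumes the standard hibernate.connection.isolation property for the application-wide default (2 = Connection.TRANSACTION_READ_COMMITTED) and Session.doWork for the per-task override; the class and method names are invented, and the exact point where you change the level has to match your connection pool and connection-release settings, so treat it as a sketch, not a recipe:

import java.sql.Connection;
import org.hibernate.Session;

public class MonthEndReconciliation {
    // Default isolation comes from configuration, e.g. in hibernate.cfg.xml:
    //   <property name="hibernate.connection.isolation">2</property>
    // Raise it for this one long-running, consistency-critical task only.
    public void run(Session session) {
        session.doWork(conn -> conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE));
        session.beginTransaction();
        // ... reconciliation queries and updates ...
        session.getTransaction().commit();
        // Restore the default before the connection goes back to the pool.
        session.doWork(conn -> conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED));
    }
}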

Don't forget about SNAPSHOT, which is right below SERIALIZABLE.
It depends on how important it is for the data to be accurate in the reports. It really is a task-by-task thing.
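For SQL Server specifically, a hedged sketch of what using SNAPSHOT for a report could look like from JDBC (database name, table name and connection details are invented; the database must first have ALLOW_SNAPSHOT_ISOLATION switched on):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SnapshotReport {
    public static void main(String[] args) throws Exception {
        // One-time DBA step, shown here for illustration only:
        //   ALTER DATABASE Erp SET ALLOW_SNAPSHOT_ISOLATION ON;
        try (Connection conn = DriverManager.getConnection(
                "jdbc:sqlserver://localhost;databaseName=Erp;integratedSecurity=true")) {
            try (Statement st = conn.createStatement()) {
                // Session-level setting; subsequent transactions on this connection use SNAPSHOT.
                st.execute("SET TRANSACTION ISOLATION LEVEL SNAPSHOT");
            }
            conn.setAutoCommit(false);
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT SUM(quantity) FROM stock")) {
                // Reads see a consistent, versioned snapshot and do not block writers.
                while (rs.next()) {
                    System.out.println(rs.getLong(1));
                }
            }
            conn.commit();
        }
    }
}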

It really depends a lot on how you design your application; the easy answer is to just run at READ_COMMITTED.
You can make an argument that, if you design your system with it in mind, you could use READ_UNCOMMITTED as the default and only increase the isolation level when you need it. The vast majority of your transactions are going to succeed anyway, so reading uncommitted data won't be a big deal.
The way isolation levels affect your queries depends on your target database. For instance, databases like Sybase and MSSQL must lock more resources when you run READ_COMMITTED than databases like Oracle.

For SQL Server (and probably most major RDBMS), I'd stick with the default. For SQL Server, this is READ COMMITTED. Anything more and you start overtaxing the DB, anything less and you've got consistency issues.

Read Uncommitted is definitely the underdog in most forums. However, there are reasons to use it that go beyond the "speed versus accuracy" trade-off that is often pointed out.
Let's say you have:
Transaction T1: Writes B, Reads A, (some more work), Commit.
Transaction T2: Writes A, Reads B, (some more work), Commit.
With read committed, the transactions above won't release their write locks until they commit. Then you can run into a situation where T1 is waiting for T2 to release A, and T2 is waiting for T1 to release B: the two transactions collide in a deadlock.
You could re-write those procedures to avoid this scenario (example: acquire resources always in alphabetical order!). Still, with too many concurrent users and tens of thousands of lines of code, this problem may become both very likely and very difficult to diagnose and resolve.
The alternative is using Read Uncommitted. Then you design your transactions assuming that there may be dirty reads. I personally find this problem much more localized and treatable than the interlocking trainwrecks.
The issues from dirty reads can be preempted by:
(1) Rollbacks: don't. This should be the last line of defense, in case of hardware failure, network failure or program crash only.
(2) Use application locks to create a locking mechanism that operates at a higher level of abstraction, where each lock is closer to a real-world resource or action.
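On SQL Server, one way to get the higher-level application locks described in (2) is sp_getapplock. A minimal sketch from JDBC, with a made-up resource name standing in for a real-world action:

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

public class StockAdjustment {
    // Serialize a real-world action ("adjust stock") with a named application lock
    // instead of relying on whatever row locks the isolation level happens to take.
    public void adjustStock(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        try (CallableStatement lock = conn.prepareCall("{? = call sp_getapplock(?, ?, ?, ?)}")) {
            lock.registerOutParameter(1, Types.INTEGER);
            lock.setString(2, "stock-adjustment"); // illustrative resource name
            lock.setString(3, "Exclusive");        // lock mode
            lock.setString(4, "Session");          // owner: held until released below (or disconnect)
            lock.setInt(5, 5000);                  // wait up to 5 seconds
            lock.execute();
            if (lock.getInt(1) < 0) {
                throw new SQLException("could not acquire application lock");
            }
        }
        try {
            // ... read and write the stock rows here ...
            conn.commit();
        } finally {
            try (CallableStatement rel = conn.prepareCall("{call sp_releaseapplock(?, ?)}")) {
                rel.setString(1, "stock-adjustment");
                rel.setString(2, "Session");
                rel.execute();
            }
        }
    }
}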

Related

Choosing transaction isolation levels

I don't have a specific example here, I was just trying to understand the different levels of transaction isolation and how one might go about deciding which is best for a given situation.
I'm trying to think of situations in which I would want a transaction that is not serializable, other than to possibly increase performance in situations where I'm willing to give up a little data integrity.
Can anybody provide an example of a situation in which "read uncommitted", "read committed", and/or "repeatable read" would be the preferable isolation level?
Using the serializable isolation level does not only have advantages, but also disadvantages:
You have to accept increased performance overhead.
You have to handle serialization errors by redoing transactions, which complicates your application code and hurts performance if it happens often.
I'll come up with use cases for the other transaction levels. This list is of course not complete:
READ UNCOMMITTED: In PostgreSQL, if you request this isolation level, you actually get READ COMMITTED, so it is irrelevant there. On database systems that use read locks, you use that isolation level to avoid them.
READ COMMITTED: This is the best isolation level if you are ready to deal with concurrent transactions yourself by locking rows that you want to be stable. The big advantage is that you never have to deal with serialization errors (except when you get a deadlock).
REPEATABLE READ: This isolation level is perfect for long running read-only transactions that want to see a consistent state of the database. The prime example is pg_dump.
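Applied to the 15% OLAP share from the original question, a long-running report can run in its own read-only, higher-isolation transaction. A sketch with plain JDBC, with invented table names; note that in PostgreSQL, which this answer appears to be written for (it mentions pg_dump), REPEATABLE READ gives a full snapshot, while other systems may still allow phantoms at this level:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class ConsistentReport {
    // Every query in this transaction sees the same state of the database.
    public void run(Connection conn) throws Exception {
        conn.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
        conn.setReadOnly(true);   // hint that no writes will follow
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement();
             ResultSet stock = st.executeQuery("SELECT item_id, quantity FROM stock")) {
            // ... consume the stock figures ...
        }
        try (Statement st = conn.createStatement();
             ResultSet orders = st.executeQuery("SELECT order_id, total FROM orders")) {
            // ... consume the order totals, consistent with the stock read above ...
        }
        conn.commit();
        conn.setReadOnly(false);
    }
}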

Fastest locking strategy (isolation level) for single user batch job

We have a SQL Server 2005 database that contains preprocessed data for reports.
All data is deleted and rebuilt from scratch every night, based on data from the production database.
During the night the job is the only thing running on that server, so I have no simultaneous access to my data.
We are currently using the default READ_COMMITTED isolation level.
I understand that SQL Server will put locks on tables, for reading and writing. Since no one else is touching my tables (both the tables I read and those I write) while my job is running, would it be faster to specify WITH (NOLOCK), or to use exclusive table locks (TABLOCKX)?
Thanks for any hints,
Yves
You probably won't notice any difference from a CPU-usage standpoint. READ COMMITTED does not even take S-locks on rows that sit on pages that do not have uncommitted rows on them. This is a little known optimization in SQL Server for the very common READ COMMITTED isolation level.
I recommend that you consider READ UNCOMMITTED (which is exactly equivalent to NOLOCK) or TABLOCK because it allows SQL Server to scan tables in allocation order (an IAM scan) as opposed to logical index order. This is good for IO patterns and, depending on the degree of fragmentation, can make a significant impact (or none at all).
For bulk writes look into the relevant guides put out by Microsoft. Make sure you take advantage of minimally-logged writes. TABLOCKX can come into play here.
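A sketch of what those hints could look like in the nightly rebuild, with invented table names; as noted above, the gain depends on fragmentation and may be nil, so measure both variants:

import java.sql.Connection;
import java.sql.Statement;

public class NightlyRebuild {
    public void rebuild(Connection conn) throws Exception {
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            st.executeUpdate("TRUNCATE TABLE report.SalesSummary");
            // TABLOCK on the source allows an allocation-ordered (IAM) scan;
            // TABLOCK on the empty target helps the insert qualify for minimal logging
            // (TABLOCKX is the stricter variant that takes the exclusive lock up front).
            st.executeUpdate(
                "INSERT INTO report.SalesSummary WITH (TABLOCK) (item_id, total) " +
                "SELECT item_id, SUM(amount) FROM prod.Sales WITH (TABLOCK) " +
                "GROUP BY item_id");
        }
        conn.commit();
    }
}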

Do database transactions prevent race conditions?

It's not entirely clear to me what transactions in database systems do. I know they can be used to roll back a list of updates completely (e.g. deduct money from one account and add it to another), but is that all they do? Specifically, can they be used to prevent race conditions? For example:
// Java/JPA example
em.getTransaction().begin();
User u = em.find(User.class, 123);
u.credits += 10;
em.persist(u); // Note added in 2016: this line is actually not needed
em.getTransaction().commit();
(I know this could probably be written as a single update query, but that's not always the case)
Is this code protected against race conditions?
I'm mostly interested in MySQL5 + InnoDB, but general answers are welcome too.
TL/DR: Transactions do not inherently prevent all race conditions. You still need locking, abort-and-retry handling, or other protective measures in all real-world database implementations. Transactions are not a secret sauce you can add to your queries to make them safe from all concurrency effects.
Isolation
What you're getting at with your question is the I in ACID - isolation. The academically pure idea is that transactions should provide perfect isolation, so that the result is the same as if every transaction executed serially. In reality that's rarely the case in real RDBMS implementations; capabilities vary by implementation, and the rules can be weakened by use of a weaker isolation level like READ COMMITTED. In practice you cannot assume that transactions prevent all race conditions, even at SERIALIZABLE isolation.
Some RDBMSs have stronger abilities than others. For example, PostgreSQL 9.2 and newer have quite good SERIALIZABLE isolation that detects most (but not all) possible interactions between transactions and aborts all but one of the conflicting transactions. So it can run transactions in parallel quite safely.
Few, if any [3], systems have truly perfect SERIALIZABLE isolation that prevents all possible races and anomalies, including issues like lock escalation and lock ordering deadlocks.
Even with strong isolation some systems (like PostgreSQL) will abort conflicting transactions, rather than making them wait and running them serially. Your app must remember what it was doing and re-try the transaction. So while the transaction has prevented concurrency-related anomalies from being stored to the DB, it's done so in a manner that is not transparent to the application.
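In application code that usually means a retry loop around the whole unit of work. A minimal sketch, assuming PostgreSQL's SQLSTATE 40001 for serialization failures (other systems signal the condition differently, e.g. SQL Server reports deadlock victims with error 1205); class and interface names are invented:

import java.sql.Connection;
import java.sql.SQLException;

public class RetryingTransaction {
    public interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    public void execute(Connection conn, int maxAttempts, SqlWork work) throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                conn.setAutoCommit(false);
                work.run(conn);
                conn.commit();
                return;
            } catch (SQLException e) {
                conn.rollback();
                boolean serializationFailure = "40001".equals(e.getSQLState());
                if (!serializationFailure || attempt >= maxAttempts) {
                    throw e;  // not retryable, or out of attempts
                }
                // otherwise: redo the whole transaction from the beginning
            }
        }
    }
}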
Atomicity
Arguably the primary purpose of a database transaction is that it provides for atomic commit. The changes do not take effect until you commit the transaction. When you commit, the changes all take effect at the same instant as far as other transactions are concerned. No transaction can ever see just some of the changes a transaction makes [1][2]. Similarly, if you ROLLBACK, then none of the transaction's changes ever get seen by any other transaction; it's as if your transaction never existed.
That's the A in ACID.
Durability
Another is durability - the D in ACID. It specifies that when you commit a transaction it must truly be saved to storage that will survive a fault like power loss or a sudden reboot.
Consistency
See Wikipedia.
Optimistic concurrency control
Rather than using locking and/or high isolation levels, it's common for ORMs like Hibernate, EclipseLink, etc to use optimistic concurrency control (often called "optimistic locking") to overcome the limitations of weaker isolation levels while preserving performance.
A key feature of this approach is that it lets you span work across multiple transactions, which is a big plus with systems that have high user counts and may have long delays between interactions with any given user.
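With Hibernate or any other JPA provider, enabling it is usually just a matter of adding a version attribute to the entity. A minimal sketch of the User entity from the code in the question (field names are assumptions):

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class User {
    @Id
    Long id;

    int credits;

    @Version      // bumped on every update; committing with a stale value fails
    int version;  // instead of silently overwriting someone else's change
}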
References
In addition to the in-text links, see the PostgreSQL documentation chapter on locking, isolation and concurrency. Even if you're using a different RDBMS you'll learn a lot from the concepts it explains.
[1] I'm ignoring the rarely implemented READ UNCOMMITTED isolation level here for simplicity; it permits dirty reads.
[2] As @meriton points out, the corollary isn't necessarily true. Phantom reads occur in anything below SERIALIZABLE: one part of an in-progress transaction doesn't see some changes (made by a not-yet-committed transaction), then a later part of the same transaction does see them once the other transaction commits.
[3] Well, IIRC SQLite2 does, by virtue of locking the whole database when a write is attempted, but that's not what I'd call an ideal solution to concurrency issues.
The database tier supports isolation of transactions to varying degrees, called isolation levels. Check the documentation of your database management system for the isolation levels supported, and their trade-offs. The strongest isolation level, Serializable, requires transactions to execute as if they were executed one by one. This is typically implemented by using exclusive locks in the database. This can cause deadlocks, which the database management system detects and fixes by rolling back some involved transactions. This approach is often referred to as pessimistic locking.
Many object-relational mappers (including JPA providers) also support optimistic locking, where update conflicts are not prevented in the database, but detected in the application tier, which then rolls back the transaction. If you have optimistic locking enabled, a typical execution of your example code would emit the following sql queries:
select id, version, credits from user where id = 123;
Let's say this returns (123, 13, 100).
update user set version = 14, credits = 110 where id = 123 and version = 13;
The database tells us how many rows were updated. If it was one, there was no conflicting update. If it was zero, a conflicting update occurred, and the JPA provider will do
rollback;
and throw an exception so application code can handle the failed transaction, for instance by retrying.
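In application code that can look like the sketch below, which re-runs the unit of work a few times. Depending on the provider, the conflict surfaces as an OptimisticLockException or wrapped in a RollbackException, so both are caught here; the User entity and field names follow the question's example, and the class name is invented:

import javax.persistence.EntityManager;
import javax.persistence.OptimisticLockException;
import javax.persistence.RollbackException;

public class CreditService {
    public void addCredits(EntityManager em, long userId, int amount) {
        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                em.getTransaction().begin();
                User u = em.find(User.class, userId);
                u.credits += amount;
                em.getTransaction().commit();  // emits the version-checked UPDATE shown above
                return;
            } catch (OptimisticLockException | RollbackException e) {
                if (em.getTransaction().isActive()) {
                    em.getTransaction().rollback();
                }
                em.clear();  // drop stale managed state before retrying
            }
        }
        throw new IllegalStateException("could not update credits after 3 attempts");
    }
}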
Summary: With either approach, your statement can be made safe from race conditions.
It depends on the isolation level. SERIALIZABLE will prevent the race condition, since at that level transactions are, in effect, processed in sequence rather than in parallel (or at least exclusive locking is used, so transactions that modify the same rows are performed one after another).
To prevent the race condition at lower isolation levels, it is better to lock the record manually (MySQL, for example, supports the SELECT ... FOR UPDATE statement, which acquires a write lock on the selected records).
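A sketch of that pessimistic approach with plain JDBC against MySQL/InnoDB, using the table and column names implied by the question's example (the JPA equivalent would be em.find(User.class, 123, LockModeType.PESSIMISTIC_WRITE)):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CreditUpdate {
    public void addCredits(Connection conn, long userId, int amount) throws Exception {
        conn.setAutoCommit(false);
        int credits;
        try (PreparedStatement sel = conn.prepareStatement(
                "SELECT credits FROM user WHERE id = ? FOR UPDATE")) {  // row stays write-locked
            sel.setLong(1, userId);
            try (ResultSet rs = sel.executeQuery()) {
                rs.next();
                credits = rs.getInt(1);
            }
        }
        try (PreparedStatement upd = conn.prepareStatement(
                "UPDATE user SET credits = ? WHERE id = ?")) {
            upd.setInt(1, credits + amount);
            upd.setLong(2, userId);
            upd.executeUpdate();
        }
        conn.commit();  // releases the row lock
    }
}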
It depends on the specific RDBMS. Generally, transactions acquire locks as decided during query plan evaluation. Some can request table-level locks, others column-level, others record-level; the last is preferred for performance. The short answer to your question is yes.
In other words, a transaction is meant to group a set of queries and represent them as an atomic operation. If the operation fails, the changes are rolled back. I don't exactly know what the adapter you're using does, but if it conforms to the definition of transactions you should be fine.
While this guarantees prevention of race conditions, it doesn't explicitly prevent starvation or deadlocks. The transaction lock manager is in charge of that. Table locks are sometimes used, but they come with the hefty price of reducing the number of concurrent operations.

Which isolation mode should you choose if you want the least concurrency?

If you need to minimize concurrency as much as possible, which isolation level (repeatable read, serializable, read committed, read uncommitted) would work best?
Serializable gives the most isolation, thus least concurrency.
http://en.wikipedia.org/wiki/Isolation_(database_systems)
I'm guessing you really want to maximize concurrency as much as possible here, to increase performance. Unfortunately, simply choosing an isolation mode won't do the trick. The real question about those isolation modes is, can you use them in your particular application?
That really depends on the intimate details of your application, and that's probably not something we can debug on Stack Overflow.
However, in general, assuming you don't get data corruption, the standard isolation levels, from most concurrent to least, are:
read uncommitted
read committed
repeatable read
serializable.
It's different for, say, PostgreSQL because it uses a different synchronization model (MVCC), where reading is free, but when you write you run the risk of rollback.
I suppose the real answer to this question is that asking it leads either to many days of study material or to hiring someone to deal with your particular situation. While it's very technical, there are no hard and fast rules: you need to understand both the theory behind what's going on and the specific situation in order to make a useful recommendation.

When do transactions become more of a burden than a benefit?

Transactional programming is, in this day and age, a staple in modern development. Concurrency and fault-tolerance are critical to an application's longevity and, rightly so, transactional logic has become easy to implement. As applications grow, though, it seems that transactional code tends to become more and more burdensome on the scalability of the application, and when you bridge into distributed transactions and mirrored data sets the issues start to become very complicated. I'm curious what seems to be the point, in data size or application complexity, where transactions frequently start becoming the source of issues (causing timeouts, deadlocks, performance issues in mission-critical code, etc.) that are more bothersome to fix, troubleshoot or work around than designing a data model that is more fault-tolerant in itself, or using other means to ensure data integrity. Also, what design patterns serve to minimize these impacts or make standard transactional logic obsolete or a non-issue?
--
EDIT: We've got some answers of reasonable quality so far, but I think I'll post an answer myself to bring up some of the things I've heard about to try to inspire some additional creativity; most of the responses I'm getting are pessimistic views of the problem.
Another important note is that not all deadlocks are a result of poorly coded procedures; sometimes there are mission-critical operations that depend on similar resources in different orders, or complex joins in different queries that step on each other. This is an issue that can sometimes seem unavoidable, but I've been a part of reworking workflows to facilitate an execution order that is less likely to cause one.
I think no design pattern can solve this issue in itself. Good database design, good stored procedure programming and especially learning how to keep your transactions short will ease most of the problems.
There is no 100% guaranteed method of not having problems though.
In basically every case I've seen in my career though, deadlocks and slowdowns were solved by fixing the stored procedures:
making sure all tables are accessed in a consistent order prevents deadlocks
fixing indexes and statistics makes everything faster (hence diminishing the chance of deadlock)
sometimes there was no real need for transactions, it just "looked" like it
sometimes transactions could be eliminated by turning multi-statement stored procedures into single-statement ones (see the sketch below)
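As an illustration of that last point, a read-modify-write that holds locks across two statements can often be collapsed into a single atomic statement. A sketch with invented table and column names:

import java.sql.Connection;
import java.sql.PreparedStatement;

public class StockReservation {
    // Instead of SELECT quantity ..., compute in the application, then UPDATE ...
    // (two statements, with locks held in between), do the whole change in one statement.
    public void reserveStock(Connection conn, long itemId, int qty) throws Exception {
        try (PreparedStatement upd = conn.prepareStatement(
                "UPDATE stock SET quantity = quantity - ? WHERE item_id = ? AND quantity >= ?")) {
            upd.setInt(1, qty);
            upd.setLong(2, itemId);
            upd.setInt(3, qty);
            if (upd.executeUpdate() == 0) {
                throw new IllegalStateException("not enough stock");  // nothing was changed
            }
        }
    }
}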
The use of shared resources is wrong in the long run, because by reusing an existing environment you are creating more and more possibilities. Just review the busy beavers :) The way Erlang goes is the right way to produce fault-tolerant and easily verifiable systems.
But transactional memory is essential for many applications in widespread use. If you look at a bank with its millions of customers, for example, you can't just copy the data for the sake of efficiency.
I think monads are a cool concept to handle the difficult concept of changing state.
One approach I've heard of is to use a versioned, insert-only model where no updates ever occur. During selects, the version is used to select only the latest rows. One downside I know of with this approach is that the database can get rather large very quickly.
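A rough sketch of that idea, with an invented schema and literal values chosen purely for illustration: every change inserts a new row, and readers pick the highest version per key:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class InsertOnlyBalances {
    public void example(Connection conn) throws Exception {
        try (Statement st = conn.createStatement()) {
            // Writers only ever insert a new version of a row; nothing is updated in place.
            st.executeUpdate(
                "INSERT INTO account_balance (account_id, version, balance) " +
                "VALUES (42, 7, 130.00)");
            // Readers take the newest version of each account.
            try (ResultSet rs = st.executeQuery(
                    "SELECT b.account_id, b.balance FROM account_balance b " +
                    "WHERE b.version = (SELECT MAX(v.version) FROM account_balance v " +
                    "WHERE v.account_id = b.account_id)")) {
                // ... render the report ...
            }
        }
    }
}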
I also know that some solutions, such as FogBugz, don't use enforced foreign keys, which I believe would also help mitigate some of these problems, because the SQL query plan can lock linked tables during selects or updates even if no data is changing in them; if a highly contended table gets locked that way, it can increase the chance of a deadlock or timeout.
I don't know much about these approaches though since I've never used them, so I assume there are pros and cons to each that I'm not aware of, as well as some other techniques I've never heard about.
I've also been looking into some of the material from Carlo Pescio's recent post, which I've not had enough time to do justice to, unfortunately, but the material seems very interesting.
If you are talking 'cloud computing' here, the answer would be to localize each transaction to the place where it happens in the cloud.
There is no need for the entire cloud to be consistent, as that would kill performance (as you noted). Simply, keep track of what is changed and where and handle multiple small transactions as changes propagate through the system.
The situation where user A updates record R and user B at the other end of the cloud does not see it (yet) is the same as the one where user A hasn't made the change yet in the current strict-transactional environment. This could lead to discrepancies in an update-heavy system, so systems should be architected to use as few updates as possible, moving things toward aggregation of data and pulling out the aggregates only once the exact figure is critical (i.e. moving the requirement for consistency from write time to critical read time).
Well, just my POV. It's hard to conceive a system that is application agnostic in this case.
Try to make changes at the database level in the smallest possible number of instructions.
The general rule is to lock a resource for the least possible time. Using T-SQL, PL/SQL, Java on Oracle or any similar approach, you can reduce the time that each transaction locks a shared resource. In fact, transactions in the database are optimized with row-level locks, multi-versioning and other kinds of intelligent techniques. If you can run the transaction at the database you save the network latency, apart from the other layers like ODBC/JDBC/OLE DB.
Sometimes the programmer tries to obtain the good things of a database (it is transactional, parallel, distributed) but keeps a cache of the data. Then they need to manually add back some of the database features.