Performant Distributed Locking - ignite

I am evaluating the best approach to distributed locking. The out-of-the-box (oob) reentrant locking support in Ignite is tied to the thread that acquires the lock. Our requirement is that locking and unlocking not be tied to the same thread/process: one thread/process can lock and another thread/process can unlock. While evaluating alternatives, we came across two options:
Use semaphores - This works but is extremely slow. Cannot use this.
Do manual locking - Use a separate cache where we add entries for locks and remove entries for unlock. But it requires handling too many edge cases, like nodes going down during put ops, deciding which other node's attempt gets through, etc. (a rough sketch of this approach appears below the question).
Just wanted to check if there are other oob performant options available that we might have overlooked.
TIA
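For reference, here is a minimal sketch of option 2 in Java, using nothing but IgniteCache.putIfAbsent/remove. The cache name, lock key format and owner-id scheme are my own assumptions, not anything Ignite prescribes, and it deliberately leaves out the hard part mentioned above (what happens when the owning node dies mid-operation); an expiry policy or a watchdog would still be needed for that.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;

// Sketch of the "manual locking" option: a dedicated cache whose entries
// represent held locks. Names below are illustrative assumptions.
public class CacheBasedLock {
    private final IgniteCache<String, String> locks;

    public CacheBasedLock(Ignite ignite) {
        this.locks = ignite.getOrCreateCache("locks"); // made-up cache name
    }

    /** Acquire: putIfAbsent is atomic cluster-wide, so only one caller can create the entry. */
    public boolean tryLock(String lockKey, String ownerId) {
        return locks.putIfAbsent(lockKey, ownerId);
    }

    /** Release from any thread/process: remove the entry only if it still belongs to this owner. */
    public boolean unlock(String lockKey, String ownerId) {
        return locks.remove(lockKey, ownerId);
    }
}
```

Because the owner id travels with the cache entry rather than with a thread, any thread or process that knows the id can release the lock, which is the decoupling described above.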

Related

Apache Ignite Continuous Query

Are Continuous Queries executed in a multi-threaded mode or just a single thread? I am trying to find out what the performance implications are when millions of entries are added to a cache for which ContinuousQuery is enabled.
Well, both - it depends on what you mean by "multi-threaded". The query's remote filter is executed by the same thread that performs the cache update, but the updates themselves are generally performed in multiple threads.
On the performance considerations: calling filters and listeners is relatively fast, and having a continuous query should not slow down your application, but make sure not to put heavy code into them, and do not acquire locks or use transactions inside them, to avoid deadlocks.
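To make the threading point concrete, here is a rough sketch of how such a query might be wired up (the cache name and value types are made up for illustration). The remote filter is the piece that runs inline with the update, so it is the part to keep light.

```java
import javax.cache.configuration.Factory;
import javax.cache.event.CacheEntryEvent;
import javax.cache.event.CacheEntryEventFilter;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.ContinuousQuery;

// Sketch of a continuous query with a remote filter and a local listener.
public class ContinuousQueryExample {
    public static void listen(Ignite ignite) {
        IgniteCache<Integer, String> cache = ignite.getOrCreateCache("events");

        ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();

        // Remote filter: runs on the node performing the update, in the same
        // thread as the update itself; keep it cheap, no locks or transactions.
        // In a real cluster the filter must be serializable/deployable on all nodes.
        Factory<CacheEntryEventFilter<Integer, String>> filterFactory =
            () -> evt -> evt.getValue() != null;
        qry.setRemoteFilterFactory(filterFactory);

        // Local listener: receives the events that passed the filter.
        qry.setLocalListener(events -> {
            for (CacheEntryEvent<? extends Integer, ? extends String> e : events)
                System.out.println("Updated: " + e.getKey() + " -> " + e.getValue());
        });

        // Keep the returned cursor open for as long as notifications are needed.
        cache.query(qry);
    }
}
```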

How simultaneous queries are handled in a MySQL database?

I am using MySQL database and I would like to know if I make multiple (500 and more) queries simultaneously in order to get information from multiple tables, how these queries are handled? Sequentially or in parallel?
Queries are always handled in parallel between multiple sessions (i.e. client connections). All queries on a single connection are run one after another. The level of parallelism between multiple connections can be configured depending on your available server resources.
Generally, some operations are guarded between individual query sessions (called transactions). These are supported by InnoDB backends, but not by MyISAM tables (though MyISAM supports a concept called atomic operations). There are various levels of isolation, which differ in which operations are guarded from each other (and thus in how operations in one transaction affect another running in parallel) and in their performance impact.
For more information read about transactions in general and the implementation in MySQL.
Each connection can run a maximum of one query at once, and does it in a single thread. The server opens one thread per query.
Normally, hopefully, queries don't block each other. Depending on the engine and the queries however, they may. There is a lot of locking in MySQL which is discussed in some detail in the manual.
However, if they don't block each other, they can still slow each other down by consuming resources. IO operations are a particular source of these slow-downs. If your data don't fit in memory, you should really limit the number of parallel queries to what your IO subsystem can handle, or things will get really bad. Measurement is the key.
I would normally say that if 500 queries are running at once (and NOT waiting on locks), you may not be getting the best value from your hardware (do you have 500 cores? How many threads are waiting for IO?).
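One way to apply that advice from the application side is to funnel all query work through a fixed-size pool, so no more than N statements are in flight at once. Here is a rough JDBC sketch assuming a plain DriverManager setup (in practice you would size a connection pool instead); the URL, credentials and query are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Cap the number of queries hitting MySQL at once by running them through a
// fixed-size thread pool sized to what the server/IO subsystem can handle.
public class BoundedQueryRunner {
    private static final int MAX_PARALLEL_QUERIES = 16; // tune by measurement
    private final ExecutorService pool = Executors.newFixedThreadPool(MAX_PARALLEL_QUERIES);

    public void submitCount(String table) {
        pool.submit(() -> {
            try (Connection c = DriverManager.getConnection(
                     "jdbc:mysql://localhost/test", "user", "password");
                 Statement st = c.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table)) {
                if (rs.next())
                    System.out.println(table + ": " + rs.getLong(1));
            } catch (Exception e) {
                e.printStackTrace(); // a real application would handle/log this properly
            }
        });
    }
}
```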
Normally all queries will be run in parallel.
But... there are some exceptions to that. Depending on your transaction isolation level a row can be locked while updating. Read more about that over here: http://dev.mysql.com/doc/refman/5.1/en/dynindex-isolevel.html

Are we asking too much of transactional memory?

I've been reading up a lot about transactional memory lately. There is a bit of hype around TM, so a lot of people are enthusiastic about it, and it does provide solutions for painful problems with locking, but you regularly also see complaints:
You can't do I/O
You have to write your atomic sections so they can run several times (be careful with your local variables!)
Software transactional memory offers poor performance
[Insert your pet peeve here]
I understand these concerns: more often than not, you find articles about STMs that only run on some particular hardware that supports some really nifty atomic operation (like LL/SC), or it has to be supported by some imaginary compiler, or it requires that all accesses to memory be transactional, it introduces type constraints monad-style, etc. And above all: these are real problems.
This has led me to ask myself: what speaks against local use of transactional memory as a replacement for locks? Would this already bring enough value, or must transactional memory be used all over the place if used at all?
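To make concrete what I mean by atomic sections having to run several times, here is a plain compare-and-swap retry loop in Java. It is not STM, but it has the same optimistic shape, and it shows why the retried region must be free of I/O and other side effects (the account and balance names are made up).

```java
import java.util.concurrent.atomic.AtomicReference;

// Not a real STM, but the same optimistic shape: the "atomic section" may be
// re-executed any number of times, so it must be side-effect-free (no I/O,
// no mutation of shared state outside the managed reference).
public class CasAccount {
    // Immutable snapshot of the state managed optimistically.
    record Balance(long cents) {}

    private final AtomicReference<Balance> ref = new AtomicReference<>(new Balance(0));

    public void deposit(long cents) {
        while (true) {                      // the section may run several times
            Balance current = ref.get();    // read a consistent snapshot
            Balance updated = new Balance(current.cents() + cents);
            if (ref.compareAndSet(current, updated))
                return;                     // "commit" succeeded
            // otherwise another thread won; retry with a fresh snapshot
        }
    }
}
```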
Yes, some of the problems you mention can be real ones now, but things evolve.
As with any new technology, first there is hype, then the technology shows that some problems remain unresolved, and then some of those problems get solved and others don't. The result is another option for solving your problems, to be used where it is the best fit.
I would say that you can use STM for the parts of your application that can live with the constraints of the current state of the art, for example parts that can tolerate some loss of efficiency.
Communication between transactional and non-transactional parts is the big problem. There are STMs that are lock-aware, so they can interact in a consistent way with non-transactional code.
I/O is also possible, but it makes your transaction irrevocable, that is, it cannot be aborted. That in turn means only one transaction can perform I/O at a time. Alternatively, you can defer the I/O until the top-level transaction has succeeded and perform it in the non-transactional world, as is done now.
Most library-based STM systems force the user to distinguish between transactional and non-transactional data, so yes, you need to understand exactly what that means. On the other hand, compilers can deduce which accesses must be transactional; the problem is that they can be too conservative, losing the efficiency we gain when we manage the different kinds of variables explicitly. This is similar to having static, local and dynamic variables: you need to know the constraints of each to write a correct program.
I've been reading up a lot about transactional memory lately.
You might also be interested in this podcast on software transactional memory, which also introduces STM using an analogy based on garbage collection:
The paper is about an analogy between garbage collection and transactional memory. In addition to seeing the beauty of the analogy, the discussion also serves as a good introduction to transactional memory (which was mentioned in the Goetz/Holmes episode) and, to some extent, to garbage collection.
If you use transactional memory as a replacement for locks, all the code that used to execute with the lock held could be rolled back (and re-executed) when a conflict is detected. Thus the code that was previously using locks must be transactional, and will have all the same drawbacks (and benefits).
So, you could possibly restrict the influence of TM to only those parts of the code that hold locks, right? In that scenario, every piece of code that can be called while a lock is held must support TM. How much of your program does not hold locks and is never called by code that holds locks?

What is the best default transaction isolation level for an ERP, if any?

Short background: We are just starting to migrate/reimplement an ERP system in Java with Hibernate, targeting a concurrent user count of 50-100 users. We use MS SQL Server as the database server, which is good enough for these loads.
Now, the old system doesn't use any transactions at all and relies for critical parts (e.g. stock changes) on setting manual locks (using flags) and releasing them. That's something like manual transaction management. But there are sometimes problems with data inconsistency. In the new system we would like to use transactions to wipe out these problems.
Now the question: What would be a good/reasonable default transaction isolation level to use for an ERP system, given a usage of about 85% OLTP and 15% OLAP? Or should I always decide on a per task basis, which transaction level to use?
And as a reminder the four transaction isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, SERIALIZABLE
99 times out of 100, read committed is the right answer. That ensures that you only see changes that have been committed by the other session (and, thus, results that are consistent, assuming you've designed your transactions correctly). But it doesn't impose the locking overhead (particularly in non-Oracle databases) that repeatable read or serializable impose.
Very occasionally, you may want to run a report where you are willing to sacrifice accuracy for speed and set a read uncommitted isolation level. That's rarely a good idea, but it is occasionally a reasonably acceptable workaround to lock contention issues.
Serializable and repeatable read are occasionally used when you have a process that needs to see a consistent set of data over the entire run regardless of what other transactions are doing at the time. It may be appropriate to set a month-end reconciliation process to serializable, for example, if there is a lot of procedural code, a possibility that users are going to be making changes while the process is running, and a requirement that the process always sees the data as it existed at the time the reconciliation started.
Don't forget about SNAPSHOT, which is right below SERIALIZABLE.
It depends on how important it is for the data to be accurate in the reports. It really is a task-by-task thing.
It really depends a lot on how you design your application, the easy answer is just run at READ_COMMITTED.
You can make an argument that if you design your system with it in mind that you could use READ_UNCOMMITTED as the default and only increase the isolation level when you need it. The vast majority of your transactions are going to succeed anyway so reading uncommitted data won't be a big deal.
The way isolation levels affect your queries depends on your target database. For instance, databases like Sybase and MSSQL must lock more resources when you run READ_COMMITTED than databases like Oracle do.
For SQL Server (and probably most major RDBMS), I'd stick with the default. For SQL Server, this is READ COMMITTED. Anything more and you start overtaxing the DB, anything less and you've got consistency issues.
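If it helps, pinning that default explicitly is a one-liner on the JDBC connection (the URL and credentials below are placeholders). With Hibernate, as far as I know, the same can be achieved via the hibernate.connection.isolation property, where 2 corresponds to TRANSACTION_READ_COMMITTED.

```java
import java.sql.Connection;
import java.sql.DriverManager;

// Sketch: opening a SQL Server connection with READ COMMITTED set explicitly.
public class IsolationDefaults {
    public static Connection open() throws Exception {
        Connection c = DriverManager.getConnection(
            "jdbc:sqlserver://localhost;databaseName=erp", "user", "password");
        c.setAutoCommit(false);
        // java.sql.Connection.TRANSACTION_READ_COMMITTED == 2, the usual default
        c.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        return c;
    }
}
```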
Read Uncommitted is definitely the underdog in most forums. However, there are reasons to use it that go beyond the matter of "speed versus accuracy" that is often pointed out.
Let's say you have:
Transaction T1: Writes B, Reads A, (some more work), Commit.
Transaction T2: Writes A, Reads B, (some more work), Commit.
With read committed, the transactions above won't release their locks until they commit. Then you can run into a situation where T1 is waiting for T2 to release A, and T2 is waiting for T1 to release B: the two transactions collide in a deadlock.
You could re-write those procedures to avoid this scenario (example: acquire resources always in alphabetical order!). Still, with too many concurrent users and tens of thousands of lines of code, this problem may become both very likely and very difficult to diagnose and resolve.
The alternative is using Read Uncommitted. Then you design your transactions assuming that there may be dirty reads. I personally find this problem much more localized and treatable than the interlocking trainwrecks.
The issues from dirty reads can be preempted by:
(1) Rollbacks: don't. Rollback should be the last line of defense, reserved for hardware failure, network failure or a program crash only.
(2) Application locks: use them to create a locking mechanism that operates at a higher level of abstraction, where each lock is closer to a real-world resource or action.
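On SQL Server, one concrete way to implement point (2) is the built-in application locks, sp_getapplock / sp_releaseapplock. Here is a rough JDBC sketch, with the resource name, lock owner and timeout chosen purely for illustration.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.Types;

// Application locks on SQL Server: the lock name maps to a real-world resource.
public class AppLock {
    /** Returns true if the lock was granted (sp_getapplock returns >= 0 on success). */
    public static boolean acquire(Connection c, String resource) throws Exception {
        try (CallableStatement cs = c.prepareCall("{? = call sp_getapplock(?, ?, ?, ?)}")) {
            cs.registerOutParameter(1, Types.INTEGER);
            cs.setString(2, resource);      // @Resource
            cs.setString(3, "Exclusive");   // @LockMode
            cs.setString(4, "Session");     // @LockOwner
            cs.setInt(5, 5000);             // @LockTimeout in milliseconds
            cs.execute();
            return cs.getInt(1) >= 0;
        }
    }

    public static void release(Connection c, String resource) throws Exception {
        try (CallableStatement cs = c.prepareCall("{? = call sp_releaseapplock(?, ?)}")) {
            cs.registerOutParameter(1, Types.INTEGER);
            cs.setString(2, resource);      // @Resource
            cs.setString(3, "Session");     // @LockOwner
            cs.execute();
        }
    }
}
```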

When do transactions become more of a burden than a benefit?

Transactional programming is, in this day and age, a staple of modern development. Concurrency and fault-tolerance are critical to an application's longevity and, rightly so, transactional logic has become easy to implement. As applications grow, though, transactional code tends to become more and more burdensome on the scalability of the application, and when you bridge into distributed transactions and mirrored data sets the issues become very complicated. I'm curious what seems to be the point, in data size or application complexity, where transactions frequently start becoming the source of issues (timeouts, deadlocks, performance problems in mission-critical code, etc.) that are more bothersome to fix, troubleshoot or work around than designing a data model that is more fault-tolerant in itself, or using other means to ensure data integrity. Also, what design patterns serve to minimize these impacts or make standard transactional logic obsolete or a non-issue?
--
EDIT: We've got some answers of reasonable quality so far, but I think I'll post an answer myself to bring up some of the things I've heard about to try to inspire some additional creativity; most of the responses I'm getting are pessimistic views of the problem.
Another important note is that not all deadlocks are a result of poorly coded procedures; sometimes there are mission-critical operations that depend on similar resources in different orders, or complex joins in different queries that step on each other. This can sometimes seem unavoidable, but I've been part of reworking workflows to facilitate an execution order that is less likely to cause one.
I think no design pattern can solve this issue by itself. Good database design, good stored procedure programming, and especially learning how to keep your transactions short will ease most of the problems.
There is no 100% guaranteed method of not having problems though.
In basically every case I've seen in my career though, deadlocks and slowdowns were solved by fixing the stored procedures:
making sure all tables are accessed in a consistent order prevents deadlocks (see the sketch after this list)
fixing indexes and statistics makes everything faster (and hence diminishes the chance of deadlock)
sometimes there was no real need for transactions, it just "looked" like there was
sometimes transactions could be eliminated by turning multi-statement stored procedures into single-statement ones.
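As a concrete illustration of the first bullet, here is a trivial sketch where every transaction that touches both tables does so in the same fixed order, so two concurrent transactions can never wait on each other in a cycle. The table and column names are made up.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;

// Lock-ordering sketch: always update inventory first, then orders, never the reverse.
public class OrderedUpdates {
    public static void transfer(Connection c, long fromId, long toId, long qty) throws Exception {
        c.setAutoCommit(false);
        try (PreparedStatement inv = c.prepareStatement(
                 "UPDATE inventory SET qty = qty - ? WHERE id = ?");
             PreparedStatement ord = c.prepareStatement(
                 "UPDATE orders SET qty = qty + ? WHERE id = ?")) {
            inv.setLong(1, qty); inv.setLong(2, fromId); inv.executeUpdate();
            ord.setLong(1, qty); ord.setLong(2, toId);   ord.executeUpdate();
            c.commit();
        } catch (Exception e) {
            c.rollback();
            throw e;
        }
    }
}
```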
The use of shared resources is wrong in the long run, because by reusing an existing environment you are creating more and more possibilities. Just look at the busy beavers :) The approach Erlang takes is the right way to produce fault-tolerant and easily verifiable systems.
But transactional memory is essential for many applications in widespread use. Take a bank with its millions of customers, for example: you can't just copy the data for the sake of efficiency.
I think monads are a cool concept to handle the difficult concept of changing state.
One approach I've heard of is to use a versioned, insert-only model where no updates ever occur. During selects, the version is used to select only the latest rows. One downside I know of with this approach is that the database can get rather large very quickly.
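Roughly, such a versioned, insert-only model could look like the sketch below (the entity_versions table and its columns are my own invention). A unique constraint on (entity_id, version) would be needed to catch two concurrent writers computing the same version number.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Never UPDATE: always INSERT a new version, and read the newest one back.
public class VersionedStore {
    public static void saveNewVersion(Connection c, long entityId, String payload) throws Exception {
        try (PreparedStatement ps = c.prepareStatement(
                "INSERT INTO entity_versions (entity_id, version, payload) "
                + "SELECT ?, COALESCE(MAX(version), 0) + 1, ? FROM entity_versions WHERE entity_id = ?")) {
            ps.setLong(1, entityId);
            ps.setString(2, payload);
            ps.setLong(3, entityId);
            ps.executeUpdate();
        }
    }

    public static String loadLatest(Connection c, long entityId) throws Exception {
        try (PreparedStatement ps = c.prepareStatement(
                "SELECT payload FROM entity_versions WHERE entity_id = ? ORDER BY version DESC")) {
            ps.setLong(1, entityId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null; // first row is the latest version
            }
        }
    }
}
```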
I also know that some solutions, such as FogBugz, don't use enforced foreign keys, which I believe would also help mitigate some of these problems, because the SQL query plan can lock linked tables during selects or updates even if no data in them is changing; if a highly contended table gets locked, that can increase the chance of a deadlock or timeout.
I don't know much about these approaches though since I've never used them, so I assume there are pros and cons to each that I'm not aware of, as well as some other techniques I've never heard about.
I've also been looking into some of the material from Carlo Pescio's recent post, which I unfortunately haven't had enough time to do justice, but the material seems very interesting.
If you are talking 'cloud computing' here, the answer would be to localize each transaction to the place where it happens in the cloud.
There is no need for the entire cloud to be consistent, as that would kill performance (as you noted). Simply keep track of what has changed and where, and handle multiple small transactions as the changes propagate through the system.
The situation where user A updates record R and user B at the other end of the cloud does not see it (yet) is the same as the one where user A hasn't made the change yet in the current strict-transactional environment. This could lead to discrepancies in an update-heavy system, so systems should be architected to perform as few updates as possible, moving towards aggregation of data and pulling out the aggregates only once the exact figure is critical (i.e. moving the requirement for consistency from write time to critical-read time).
Well, that's just my POV. It's hard to conceive of a system that is application-agnostic in this case.
Try to make changes at the database level in the fewest possible instructions.
The general rule is to lock a resource for the shortest possible time. Using T-SQL, PL/SQL, Java on Oracle or any similar approach, you can reduce the time that each transaction locks a shared resource. In fact, transactions in the database are optimized with row-level locks, multi-versioning and other kinds of intelligent techniques. If you can perform the transaction at the database, you save the network latency, apart from other layers like ODBC/JDBC/OLEDB.
Sometimes the programmer tries to obtain the good properties of a database (it is transactional, parallel, distributed) but keeps a cache of the data. Then they need to manually re-add some of the database's features.