JTA Transaction Support for GemFire - gemfire

Can anyone please help me with the queries below?
1. How can I achieve 100% consistency between the cache and the database if I want both GemFire and the database to participate in a JTA transaction as regular transactional resources (supporting two-phase commit)?
2. Does the "last resource" optimization guarantee 100% consistency?
3. Which JTA transaction managers are supported and tested with the "last resource" optimization?
4. Which external transaction managers are supported and tested with GemFire?

For a non-last-resource JTA transaction, GemFire registers as a Synchronization, so it has a say in the outcome of the transaction; but if the GemFire server were to die within the small window between the beforeCompletion() and afterCompletion() callbacks, GemFire would be inconsistent with the other resources in the transaction. When GemFire is used as a last resource, it gets the final say in the outcome of the transaction, so this window is effectively closed.
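For reference, that window corresponds to the two callbacks of the standard javax.transaction.Synchronization contract. The sketch below is not GemFire's actual implementation; it only illustrates, under that assumption, why a crash between the two callbacks leaves a synchronization-based participant unaware of the final outcome:

import javax.transaction.Status;
import javax.transaction.Synchronization;
import javax.transaction.Transaction;
import javax.transaction.TransactionManager;

// Illustrative only: a resource that, like GemFire in non-last-resource mode,
// participates via Synchronization callbacks rather than as a full XA resource.
public class CacheSynchronization implements Synchronization {

    @Override
    public void beforeCompletion() {
        // Called before the transaction manager runs the commit protocol;
        // a cache would stage its pending writes here.
    }

    @Override
    public void afterCompletion(int status) {
        // Called once the global outcome is known. If the process dies between
        // beforeCompletion() and this callback, the cache never learns the
        // outcome -- that is the consistency window described above.
        if (status == Status.STATUS_COMMITTED) {
            // apply the staged writes
        } else {
            // discard the staged writes
        }
    }

    // Registration happens against the active JTA transaction:
    public static void register(TransactionManager tm) throws Exception {
        Transaction tx = tm.getTransaction();
        tx.registerSynchronization(new CacheSynchronization());
    }
}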
You will need to consult the documentation of your JTA transaction manager to see whether it can guarantee 100% consistency.
In GemFire the last-resource optimization is tested with WebLogic, and GemFire works with JTA transaction managers that register themselves in JNDI under the following names (a lookup sketch follows the list):
"java:/TransactionManager"
"java:comp/TransactionManager"
"java:appserver/TransactionManager"
"java:pm/TransactionManager"
"java:comp/UserTransaction"

Related

How to handle data from an external, independent data source with Pivotal GemFire?

I am new to GemFire.
Currently we are using a MySQL DB and would like to move to GemFire.
How do we move the existing data stored in MySQL over to GemFire? I.e., is there any way to import existing MySQL data into GemFire?
There are many different options available for migrating data from one data store (e.g. an RDBMS like MySQL) to an IMDG (e.g. Pivotal GemFire). Pivotal GemFire does not provide any tools for this purpose out of the box (OOTB).
However, you could...
A) Write a Spring Batch application to migrate all your data from MySQL to Pivotal GemFire in one fell swoop. This is typical for most large-scale conversion processes, converting from one data store to another, either as part of an upgrade or a migration.
The advantage of using Pivotal GemFire as your target data store is that it stores Java objects. So, if you are, say, using an ORM tool (e.g. Hibernate) to map the data stored in your MySQL database tables back to your application domain objects, you can then immediately and simply turn around and store those same objects directly into a corresponding Region in Pivotal GemFire. There is no additional mapping required to store an object into GemFire.
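To make the "no additional mapping" point concrete, here is a minimal sketch. It assumes a hypothetical Customer entity already loaded by your ORM and a pre-configured "Customers" Region; the Region API shown follows the Apache Geode / Pivotal GemFire public API, but treat the details as illustrative rather than prescriptive:

import java.util.List;
import org.apache.geode.cache.Region;

public class CustomerMigration {

    // Copies ORM-loaded entities straight into GemFire, keyed by their ID.
    // 'customers' would come from Hibernate/JPA; 'region' from your cache configuration.
    public void migrate(List<Customer> customers, Region<Long, Customer> region) {
        for (Customer customer : customers) {
            region.put(customer.getId(), customer); // same object, no extra mapping step
        }
    }
}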
Alternatively, if you need something less immediate, then you can also...
B) Take advantage of Pivotal GemFire's CacheLoader, and maybe even the CacheWriter mechanisms. The CacheLoader and CacheWriter are implementations of the "Read-Through" and "Write-Through" design patterns.
More details of this approach can be found here.
In a nutshell, you implement a CacheLoader to load data from some external data source on a cache miss. You attach, or register, the CacheLoader with a GemFire Region when the Region is created. When a Key (which can correspond to your MySQL table primary key) is requested (Region.get(key)) and an entry does not exist, GemFire will consult the CacheLoader to resolve the value, provided you actually registered a CacheLoader with the Region.
In this way, you slowly build up Pivotal GemFire from the MySQL RDBMS based on need.
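A minimal CacheLoader along those lines might look like the sketch below. The CustomerRepository used to query MySQL is a hypothetical stand-in for whatever JDBC/ORM access you already have; CacheLoader and LoaderHelper are the GemFire/Geode callback types:

import org.apache.geode.cache.CacheLoader;
import org.apache.geode.cache.CacheLoaderException;
import org.apache.geode.cache.LoaderHelper;

// Read-Through: invoked by GemFire on a cache miss for Region.get(key)
public class MySqlCustomerLoader implements CacheLoader<Long, Customer> {

    private final CustomerRepository repository; // hypothetical MySQL-backed DAO

    public MySqlCustomerLoader(CustomerRepository repository) {
        this.repository = repository;
    }

    @Override
    public Customer load(LoaderHelper<Long, Customer> helper) throws CacheLoaderException {
        // helper.getKey() is the missing key, e.g. the MySQL primary key
        return repository.findById(helper.getKey()); // returning null means "not found"
    }

    @Override
    public void close() {
        // release any resources (connections, pools) held by the loader
    }
}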
Clearly, it is quite likely Pivotal GemFire will not be able to store all the data from your RDBMS in "memory". So, you can enable both Persistence and Overflow [to Disk] capabilities. By enabling Persistence, GemFire will load the data from its own DiskStores the next time the nodes come online, assuming you brought them down beforehand.
The CacheWriter mechanism is nice if you want to run both Pivotal GemFire and MySQL in parallel for a while, until you can shift enough of the responsibilities from MySQL over to GemFire, for instance. The CacheWriter will write back to your underlying MySQL DB each time an entry is written or updated in the GemFire Region. You can even do this asynchronously (i.e. "Write-Behind") using GemFire's AsyncEventQueues and Listeners; see here.
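For completeness, a Write-Through CacheWriter could be sketched like this. Again, CustomerRepository is a hypothetical MySQL DAO, and only the create/update callbacks are shown; a real implementation would also deal with destroys and region-level events as needed:

import org.apache.geode.cache.CacheWriterException;
import org.apache.geode.cache.EntryEvent;
import org.apache.geode.cache.util.CacheWriterAdapter;

// Write-Through: called by GemFire before an entry is created or updated in the Region
public class MySqlCustomerWriter extends CacheWriterAdapter<Long, Customer> {

    private final CustomerRepository repository; // hypothetical MySQL-backed DAO

    public MySqlCustomerWriter(CustomerRepository repository) {
        this.repository = repository;
    }

    @Override
    public void beforeCreate(EntryEvent<Long, Customer> event) throws CacheWriterException {
        repository.save(event.getNewValue()); // throwing here aborts the Region operation
    }

    @Override
    public void beforeUpdate(EntryEvent<Long, Customer> event) throws CacheWriterException {
        repository.save(event.getNewValue());
    }
}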
Obviously, you have many options at your disposal. You need to carefully weigh your options and choose an approach that best meets your application requirements and needs.
If you have additional questions, let me know.

What is the difference between JTA and a local transaction?

What is the difference between JTA and a local transaction?
An example that shows when to use JTA and when to use a local transaction would be great.
JTA is a general API for managing transactions in Java. It allows you to start, commit and roll back transactions in a resource-neutral way. Transactional status is typically stored in TLS (Thread Local Storage) and can be propagated to other methods in a call stack without needing some explicit context object to be passed around. Transactional resources can join the ongoing transaction. If there is more than one resource participating in such a transaction, at least one of them has to be a so-called XA resource.
A resource local transaction is a transaction that you have with a specific single resource using its own specific API. Such a transaction typically does not propagate to other methods in a call-stack and you are required to pass some explicit context object around. In the majority of the resource local transactions it's not possible to have multiple resources participating in the same transaction.
You would use a resource-local transaction in, for instance, low-level JDBC code in Java SE. Here the context object is expressed by an instance of java.sql.Connection. Other examples of resource-local transactions are found in enterprise applications written around 2002. Since transaction managers (used by JTA) were expensive, closed-source and complicated things to set up in that era, people went with the cheaper and easier-to-obtain resource-local variants.
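A minimal resource-local example using plain JDBC, where the java.sql.Connection is the explicit context object (the connection URL and SQL are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ResourceLocalExample {

    public void transferLocally() throws SQLException {
        try (Connection con = DriverManager.getConnection("jdbc:mysql://localhost/bank")) {
            con.setAutoCommit(false); // start the resource-local transaction
            try (PreparedStatement debit = con.prepareStatement(
                     "UPDATE account SET balance = balance - 100 WHERE id = 1");
                 PreparedStatement credit = con.prepareStatement(
                     "UPDATE account SET balance = balance + 100 WHERE id = 2")) {
                debit.executeUpdate();
                credit.executeUpdate();
                con.commit();   // only this one connection/resource is involved
            } catch (SQLException e) {
                con.rollback(); // undo both statements on failure
                throw e;
            }
        }
    }
}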
You would use a JTA transaction in basically every other scenario. Very simple, small, free and open-source servers like TomEE (25MB) or GlassFish (35MB) have JTA support out of the box. There's nothing to set up and they Just Work.
Finally, technologies like EJB and Spring make even JTA easier to use by offering declarative transactions. In most cases it's advised to use those as they are easier, cleaner and less error prone. Both EJB and Spring can use JTA under the covers.
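As an illustration of declarative transactions, here is a sketch of a Spring-managed service method; EJB offers the equivalent via container-managed transactions. The AccountRepository is a hypothetical data-access bean:

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class TransferService {

    private final AccountRepository accounts; // hypothetical repository bean

    public TransferService(AccountRepository accounts) {
        this.accounts = accounts;
    }

    // Spring starts a transaction before the method and commits it afterwards;
    // an unhandled runtime exception triggers a rollback. With a JTA transaction
    // manager configured, the same annotation drives a global (JTA) transaction.
    @Transactional
    public void transfer(long fromId, long toId, long amount) {
        accounts.debit(fromId, amount);
        accounts.credit(toId, amount);
    }
}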
The transaction-type should be set to "RESOURCE_LOCAL" for a Java SE application and to "JTA" for a Java EE application.
"RESOURCE_LOCAL" may work fine for a web application deployed on Tomcat, but may cause issues when you run your application in a GlassFish environment.
If you are working with distributed transactions, you must use "JTA" as your transaction type.
The Java Transaction API (JTA) is one of the Java Enterprise Edition (Java EE) APIs allowing distributed transactions to be done across multiple XA resources in a Java environment.
A J2EE application includes support for distributed transactions through two specifications:
JTA ---> Java Transaction API: the higher-level API, and it is always enabled.
JTS ---> Java Transaction Service: the lower-level specification of the transaction manager that backs JTA.

Using an XA DataSource or a non-XA DataSource for JTA-based transactions in JPA

We are using JPA 1.0 for ORM-based operations and we want to have a JTA datasource for our application. We have only one database to which our application will connect.
We start our transaction boundary in the controller class and it goes down to the DAO layer: controller --> BOImpl --> DAO.
In the WebSphere Application Server admin console, when I am defining the datasource, should I use a non-XA datasource or an XA datasource?
My understanding is that for a single datasource I should not use an XA datasource.
Please let me know what I need to use.
For a single resource (like a single DB) you indeed do not need an XA-datasource.
On the other hand, bear in mind that most JTA/JTS implementations actually recognize when there is only one resource participating in a transaction, so the overhead of XA is then minimal or none. There can also be additional participants in the transaction that you might not think of now, like sending JMS messages.
But if you're really sure you only have 1 resource participating, you can safely go for non-XA.
I hope your doubt is clear by now, but here is some more information just in case.
The typical XA resources are databases, messaging queuing products such as JMS or WebSphere MQ, mainframe applications, ERP packages, or anything else that can be coordinated with the transaction manager. XA is used to coordinate what is commonly called a two-phase commit (2PC) transaction. The classic example of a 2PC transaction is when two different databases need to be updated atomically. Most people think of something like a bank that has one database for savings accounts and a different one for checking accounts. If a customer wants to transfer money between his checking and savings accounts, both databases have to participate in the transaction or the bank risks losing track of some money.
The problem is that most developers think, "Well, my application uses only one database, so I don't need to use XA on that database." This may not be true. The question that should be asked is, "Does the application require shared access to multiple resources that need to ensure the integrity of the transaction being performed?" For instance, does the application use Java 2 Connector Architecture adapters or the Java Message Service (JMS)? If the application needs to update the database and any of these other resources in the same transaction, then both the database and the other resource need to be treated as XA resources.
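To make that last point concrete, the sketch below updates a database and sends a JMS message in a single container-managed JTA transaction; this is exactly the case where both resources must be XA-capable so the transaction manager can coordinate a two-phase commit. The JNDI names and SQL are placeholders for whatever your server defines:

import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.jms.ConnectionFactory;
import javax.jms.JMSContext;
import javax.jms.Queue;
import javax.sql.DataSource;

// Container-managed JTA transaction: the DB update and the JMS send
// either both commit or both roll back, which requires XA resources.
@Stateless
public class OrderService {

    @Resource(lookup = "jdbc/OrdersXA")           // placeholder XA datasource JNDI name
    private DataSource dataSource;

    @Resource(lookup = "jms/XAConnectionFactory") // placeholder XA connection factory
    private ConnectionFactory connectionFactory;

    @Resource(lookup = "jms/OrderEvents")         // placeholder queue
    private Queue orderEvents;

    public void placeOrder(long orderId) throws Exception {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                 "UPDATE orders SET status = 'PLACED' WHERE id = ?")) {
            ps.setLong(1, orderId);
            ps.executeUpdate();                   // enlisted in the JTA transaction
        }
        try (JMSContext jms = connectionFactory.createContext()) {
            jms.createProducer().send(orderEvents, "order-placed:" + orderId); // same transaction
        }
    }
}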

nhibernate transaction deadlock issue

I am currently facing transaction deadlock issues in the NHibernate data layer. The scenario is:
I have a large transaction table T1. This table undergoes frequent write/update operations. Also, a service frequently reads this table (based on a filter) and merges the data with a client cache (on the client machine). The frequency of reads is very high.
At times (it does not follow a pattern), the deadlock issue surfaces.
How can I trace this issue? (I have the dbo role.) Is there any NHibernate setting that can help in this context?
As your service only reads the table, you could change your application to use an optimistic lock approach.
More about optimistic locking with NH: http://knol.google.com/k/fabio-maulo/nhibernate-chapter-5-basic-o-r-mapping/1nr4enxv3dpeq/8
Filipe

Using IsolationLevel.Snapshot but DB is still locking

I'm part of a team building an ADO.NET based web site. We sometimes have several developers and an automated testing tool working simultaneously against a development copy of the database.
We use the snapshot isolation level, which, to the best of my knowledge, uses optimistic concurrency: rather than locking, it hopes for the best and throws an exception when you try to commit a transaction whose affected rows have been altered by another party during the transaction.
To use snapshot isolation level we use:
ALTER DATABASE <database name>
SET ALLOW_SNAPSHOT_ISOLATION ON;
and in C#:
transaction = connection.BeginTransaction(IsolationLevel.Snapshot);
Note that IsolationLevel Snapshot isn't the same as ReadCommitted Snapshot, which we've also tried, but are not currently using.
When one of the developers enters debug mode and pauses the .NET app, they will hold a connection with an active transaction while debugging. Now, I'd expect this not to be a problem - after all, all transactions are using snapshot isolation level, so while one transaction is paused, other transactions should be able to proceed normally since the paused transaction isn't holding any locks. Of course, when the paused transaction completes, it is likely to detect a conflict; but that's acceptable so long as other developers and the automated tests can proceed unhindered.
However, in practice, when one person halts a transaction while debugging, all other DB users attempting to access the same rows are blocked despite using snapshot isolation level.
Does anybody know why this occurs, and/or how I can achieve true optimistic (non-blocking) concurrency?
The resolution (an unfortunate one for me): Remus Rusanu noted that writers always block other writers; this is backed up by MSDN - it doesn't quite come out and say so, but only ever mentions avoiding reader-writer locks. In short, the behavior I want isn't implemented in SQL Server.
SNAPSHOT isolation level affects, like all isolation levels, only reads. Writes are still blocking each other. If you believe that what you see are read blocks, then you should investigate further and check out the resource types and resource names on which blocking occurs (wait_type and wait_resource in sys.dm_exec_requests).
I wouldn't advise making code changes in order to support a scenario that involves developers staring at the debugger for minutes on end. If you believe that this scenario can repeat in production (i.e. a client hangs), then it is a different story. To achieve what you want you must minimize writes and perform all writes at the end of the transaction, in one single call that commits before returning. This way no client can hold X locks for a long time (it cannot hang while holding X locks). In practice this is pretty hard to pull off and requires a lot of discipline on the part of developers in how they write the data access code.
Have you looked at the locks when one developer pauses the transaction? Also, just turning on snapshot isolation level does not have much effect. Have you set ALLOW_SNAPSHOT_ISOLATION ON?
Here are the steps:
ALTER DATABASE <databasename>
SET READ_COMMITTED_SNAPSHOT ON;
GO
ALTER DATABASE <database name>
SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
After the database has been enabled for snapshot isolation, developers and users must then request that their transactions be run in this snapshot mode. This must be done before starting a transaction, either by a client-side directive on the ADO.NET transaction object or within their Transact-SQL query by using the following statement:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
Raj