EclipseLink very slow on inserting data

I'm using the latest EclipseLink version with MySQL 5.5 (table type InnoDB). I'm inserting about 30900 records (which could also be more) at a time.
The problem is that the insert performance is pretty poor: it takes about 22 seconds to insert all the records (compared with plain JDBC: 7 seconds). I've read that using batch writing should help - but it doesn't!?
@Entity
public class TestRecord {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    public Long id;

    public int test;
}
The code to insert the records:
factory = Persistence.createEntityManagerFactory("xx_test");
EntityManager em = factory.createEntityManager();

em.getTransaction().begin();
for (int i = 0; i < 30900; i++) {
    TestRecord record = new TestRecord();
    record.test = 21;
    em.persist(record);
}
em.getTransaction().commit();
em.close();
And finally my EclipseLink configuration:
<persistence-unit name="xx_test" transaction-type="RESOURCE_LOCAL">
<class>com.test.TestRecord</class>
<properties>
<property name="javax.persistence.jdbc.driver" value="com.mysql.jdbc.Driver" />
<property name="javax.persistence.jdbc.url" value="jdbc:mysql://localhost:3306/xx_test" />
<property name="javax.persistence.jdbc.user" value="root" />
<property name="javax.persistence.jdbc.password" value="test" />
<property name="eclipselink.jdbc.batch-writing" value="JDBC" />
<property name="eclipselink.jdbc.cache-statements" value="true"/>
<property name="eclipselink.ddl-generation.output-mode" value="both" />
<property name="eclipselink.ddl-generation" value="drop-and-create-tables" />
<property name="eclipselink.logging.level" value="INFO" />
</properties>
</persistence-unit>
What am I doing wrong? I've tried several settings, but nothing seems to help.
Thanks in advance for helping me! :)
-Stefan
Another thing to do is add ?rewriteBatchedStatements=true to the JDBC URL used by the connector.
This brought the execution of about 120300 inserts down to about 30 s, from about 60 s before.
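A minimal sketch of how that could be wired up, reusing the persistence unit from the question and overriding the JDBC URL programmatically (you could just as well edit the URL directly in persistence.xml):

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class RewriteBatchedStatementsExample {
    public static void main(String[] args) {
        // Override the JDBC URL so the MySQL driver rewrites batched INSERTs
        // into a single multi-row INSERT statement.
        Map<String, String> overrides = new HashMap<>();
        overrides.put("javax.persistence.jdbc.url",
                "jdbc:mysql://localhost:3306/xx_test?rewriteBatchedStatements=true");

        EntityManagerFactory factory =
                Persistence.createEntityManagerFactory("xx_test", overrides);
        // ... persist entities as in the question ...
        factory.close();
    }
}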

JDBC batch writing improves performance drastically; please try it.
E.g.: <property name="eclipselink.jdbc.batch-writing" value="JDBC" />

@GeneratedValue(strategy = GenerationType.IDENTITY)
Switch to TABLE sequencing; IDENTITY is never recommended and is a major performance issue.
See: http://java-persistence-performance.blogspot.com/2011/06/how-to-improve-jpa-performance-by-1825.html
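For example, a sketch of the entity switched to TABLE sequencing (the generator and table names below are made up for illustration; allocationSize controls how many ids are preallocated per database round trip):

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.TableGenerator;

@Entity
public class TestRecord {

    @Id
    @TableGenerator(name = "TEST_RECORD_GEN",   // hypothetical generator name
                    table = "ID_GEN",            // hypothetical id-allocation table
                    pkColumnName = "GEN_NAME",
                    valueColumnName = "GEN_VALUE",
                    allocationSize = 500)        // preallocate ids so inserts can be batched
    @GeneratedValue(strategy = GenerationType.TABLE, generator = "TEST_RECORD_GEN")
    public Long id;

    public int test;
}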
I seem to remember that MySQL may not support batch writing without some database configuration as well; there was another post on this - I forget the URL, but you could probably search for it.

Probably the biggest difference besides the mapping conversion is the caching. By default EclipseLink places each of the entities into the persistence context (EntityManager), and then during the finalization of the commit it needs to add them all to the cache.
One thing to try for now:
Measure how long an em.flush() call takes after the loop but before the commit. Then, if you want, you could call em.clear() after the flush so that the newly inserted entities are not merged into the cache (a sketch of this follows below).
Doug
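A minimal sketch of that measurement, based on the insert loop from the question:

em.getTransaction().begin();
for (int i = 0; i < 30900; i++) {
    TestRecord record = new TestRecord();
    record.test = 21;
    em.persist(record);
}

long start = System.currentTimeMillis();
em.flush();   // pushes all pending INSERTs to the database
System.out.println("flush took " + (System.currentTimeMillis() - start) + " ms");

em.clear();   // detach the new entities so they are not merged into the cache on commit
em.getTransaction().commit();
em.close();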

Related

Is an XA transaction really atomic?

It seems that I don't completely understand how an XA transaction works. I thought it was atomic: that when I commit a transaction, the new messages and the new data become available at the same time.
This misunderstanding led me to the following issue:
New rows are inserted into the DB and a message is sent to a queue in a transactional route. In another route the message is received. This route then tries to perform some manipulations on the rows that were inserted in the previous route - but it doesn't see them!
The second route is configured to roll the message back to the queue when an exception happens, and I see that on the second run the route does see the rows!
In conclusion, I would ask the following questions:
Is an XA transaction really atomic?
If not, how can I configure the commit order for my transactional resources?
Additional note: the issue is found in Fuse ESB/ServiceMix 4.4.1
To Jake: my Camel context configuration looks like the following:
<osgi:reference id="osgiPlatformTransactionManager" interface="org.springframework.transaction.PlatformTransactionManager"/>
<osgi:reference id="osgiJtaTransactionManager" interface="javax.transaction.TransactionManager"/>
<osgi:reference id="myDataSource"
interface="javax.sql.DataSource"
filter="(osgi.jndi.service.name=jdbc/postgresXADB)"/>
<bean id="PROPAGATION_MANDATORY" class="org.apache.camel.spring.spi.SpringTransactionPolicy">
<property name="transactionManager" ref="osgiPlatformTransactionManager"/>
<property name="propagationBehaviorName" value="PROPAGATION_MANDATORY"/>
</bean>
<bean id="PROPAGATION_REQUIRED" class="org.apache.camel.spring.spi.SpringTransactionPolicy">
<property name="transactionManager" ref="osgiPlatformTransactionManager"/>
<property name="propagationBehaviorName" value="PROPAGATION_REQUIRED"/>
</bean>
<bean id="jmstx" class="org.apache.activemq.camel.component.ActiveMQComponent">
<property name="configuration" ref="jmsTxConfig" />
</bean>
<bean id="jmsTxConfig" class="org.apache.camel.component.jms.JmsConfiguration">
<property name="connectionFactory" ref="jmsXaPoolConnectionFactory"/>
<property name="transactionManager" ref="osgiPlatformTransactionManager"/>
<property name="transacted" value="false"/>
<property name="cacheLevelName" value="CACHE_NONE"/>
<property name="concurrentConsumers" value="${jms.concurrentConsumers}" />
</bean>
<bean id="jmsXaPoolConnectionFactory" class="org.apache.activemq.pool.XaPooledConnectionFactory">
<property name="maxConnections" value="${jms.maxConnections}" />
<property name="connectionFactory" ref="jmsXaConnectionFactory" />
<property name="transactionManager" ref="osgiJtaTransactionManager" />
</bean>
<bean id="jmsXaConnectionFactory" class="org.apache.activemq.ActiveMQXAConnectionFactory">
<property name="brokerURL" value="${jms.broker.url}"/>
<property name="redeliveryPolicy">
<bean class="org.apache.activemq.RedeliveryPolicy">
<property name="maximumRedeliveries" value="-1"/>
<property name="initialRedeliveryDelay" value="2000" />
<property name="redeliveryDelay" value="5000" />
</bean>
</property>
</bean>
The DB data source is configured as follows:
<bean id="myDataSource" class="org.postgresql.xa.PGXADataSource">
<property name="serverName" value="${db.host}"/>
<property name="databaseName" value="${db.name}"/>
<property name="portNumber" value="${db.port}"/>
<property name="user" value="${db.user}"/>
<property name="password" value="${db.password}"/>
</bean>
<service ref="myDataSource" interface="javax.sql.XADataSource">
<service-properties>
<entry key="osgi.jndi.service.name" value="jdbc/postgresXADB"/>
<entry key="datasource" value="postgresXADB"/>
</service-properties>
</service>
I'm not an expert in this stuff, but my view would be that the atomicity XA provides guarantees only that:
1. either the entire commit occurs or the entire commit rolls back;
2. the whole commit/rollback completes before the commit request returns to whoever called it.
I don't think any guarantee is made regarding the individual participants completing at the same instant, nor is there any kind of 'commit dependency tree' maintained guaranteeing that subsequent processing only happens on participants which have committed.
I think to achieve what you want, you might need to put the message queue outside the main transaction... Which destroys the whole point of the transaction in the first place :(
I think you might just have to put a retry/timeout loop in your downstream processing. The alternative might be to explore the concurrency options to see if you can allow that downstream transaction to 'see' the upstream one.
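For example, a rough sketch of such a retry/timeout loop in plain Java; the findRowsForMessage() helper and the timeout values are hypothetical placeholders for your own DAO code:

import java.util.Collections;
import java.util.List;

public class RetryUntilVisible {

    // Hypothetical downstream step: retry until the rows written by the upstream
    // XA transaction become visible, giving up after a fixed timeout.
    static List<Object[]> waitForRows(Object message) throws InterruptedException {
        long deadline = System.currentTimeMillis() + 30_000;   // 30 s timeout, arbitrary
        List<Object[]> rows = Collections.emptyList();
        while (rows.isEmpty() && System.currentTimeMillis() < deadline) {
            rows = findRowsForMessage(message);                 // hypothetical DAO lookup
            if (rows.isEmpty()) {
                Thread.sleep(500);                              // back off before retrying
            }
        }
        if (rows.isEmpty()) {
            // Throwing here lets the broker redeliver the message later.
            throw new IllegalStateException("rows not visible in time");
        }
        return rows;
    }

    // Placeholder for the real database query.
    static List<Object[]> findRowsForMessage(Object message) {
        return Collections.emptyList();
    }
}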
Hopefully this answer will prompt someone with more knowledge of this stuff to chip in!
For XA transactions, when you commit, Camel will run an XA commit for each DB, ensuring that all the XA transactions will be committed eventually, not simultaneously. Since the data in each DB is committed separately, it is not atomic, and it is impossible to make data modifications across databases atomic.
For your application, there may be two choices to avoid the problem:
1. Don't use XA transactions; instead use the outbox pattern. You can update part of the data, send a message to the queue, and then return. Any order-dependent operation is put into the queue, where you can easily control the order.
2. Keep your XA solution, but when you want to read the newest data, run a SELECT ... FOR UPDATE, which will wait for the row lock held by the unfinished XA transaction. When the XA transaction has returned, the SELECT ... FOR UPDATE will return the newest data (see the sketch after this list).
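A minimal JDBC sketch of the second option; the table and column names are made up for illustration, and the DataSource would be the postgresXADB one from the question:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class SelectForUpdateExample {

    // SELECT ... FOR UPDATE waits on the row lock still held by the in-flight
    // XA transaction, so it only returns once the committed data is visible.
    static void readNewestRow(DataSource dataSource, long rowId) throws SQLException {
        String sql = "SELECT * FROM my_table WHERE id = ? FOR UPDATE";  // hypothetical table/column
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, rowId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // process the row; it now reflects the data committed by the XA transaction
                }
            }
        }
    }
}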
Your AMQ_SCHEDULED_DELAY header is a workaround, but it does not work when exceptions happen.

Expiration of NHibernate query cache

Is it possible to configure expiration of NHibernate's query cache?
For the second-level cache I can do it from nhibernate.cfg.xml, but I can't find a way for the SQL query cache.
EDIT:
ICriteria query = CreateCriteria()
.Add(Expression.Eq("Email", identifiant))
.SetCacheable(true)
.SetCacheRegion("X");
<syscache>
<cache region="X" expiration="10" priority="1" />
</syscache>
Yes, we can set cache expiration via region. Adjust the query like this:
criteria.SetCacheable(true)
.SetCacheMode(CacheMode.Normal)
.SetCacheRegion("LongTerm");
And put a similar configuration into the web.config file:
<configSections>
<section name="syscache" type="NHibernate.Caches.SysCache.SysCacheSectionHandler, NHibernate.Caches.SysCache" requirePermission="false" />
</configSections>
<syscache>
<cache region="LongTerm" expiration="180" priority="5" />
<cache region="ShortTerm" expiration="60" priority="3" />
</syscache>
EDIT: I am just adding this link - Class-cache not used when getting entity by criteria - to make clear what I mean by the SQL query cache. In the linked answer I explain that topic.
Just for clarity, the configuration of the NHibernate "session-factory" must contain:
<property name="cache.use_query_cache">true</property>
This switch will make the query cache work. More details: http://nhibernate.info/doc/nh/en/index.html#performance-querycache

Hibernate 4.1.2.FINAL Properties hbm2ddl.import_files don't seems to work

Hi, I have an issue with hbm2ddl.import_files: it doesn't seem to work, and nothing about it appears in the log.
This is my configuration:
<property name="hibernateProperties">
<value>
hibernate.dialect=${hibernate.dialect}
hibernate.default_schema=${hibernate.default_schema}
hibernate.jdbc.batch_size=${hibernate.jdbc.batch_size}
hibernate.show_sql=${hibernate.show_sql}
hibernate.hbm2ddl.auto=${hibernate.hbm2ddl.auto}
hibernate.id.new_generator_mappings=${hibernate.id.new_generator_mappings}
hibernate.hbm2ddl.import_files=${hibernate.hbm2ddl.import_files}
<!-- Auto Generated Schemas and tables not good for production
hibernate.hbm2ddl.auto=update-->
</value>
</property>
hibernate.hbm2ddl.import_files is set to /import.sql, and the file is:
insert into DEPARTAMENTO (NOMBRE_DEPART,REFERENCIA_DEPART) values ('AMAZONAS')
The jdbc.properties:
#org.hibernate.dialect.PostgreSQLDialect
hibernate.default_schema = "DBMERCANCIAS"
hibernate.show_sql = true
hibernate.id.new_generator_mappings = true
hibernate.hbm2ddl.auto = create
hibernate.jdbc.batch_size = 5
#Default the factory to use to instantiate transactions org.transaction.JDBCTransactionFactory
hibernate.transaction.factory_class=org.transaction.JDBCTransactionFactory
#Initialize values statements only on create-drop or create
hibernate.hbm2ddl.import_files = /import.sql
The database is PostgreSQL 9.1.1, with Spring 3.1.0.RELEASE and Hibernate 4.1.2.Final. hibernate.hbm2ddl.auto is set to "create"; the tables and the schema are created, but the SQL insert command is not run - why? I can't see in the log where this command runs.
My error was the location in hibernate properties.
hibernate.hbm2ddl.import_files = /META-INF/spring/import.sql
is the correct location.
You could put import.sql on the classpath (/classes/import.sql) and remove the property hibernate.hbm2ddl.import_files from the Hibernate configuration/properties.
NOTE: hibernate.hbm2ddl.auto must be set to create
<bean id="sessionFactory"
class="org.springframework.orm.hibernate4.LocalSessionFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="hibernateProperties">
<props>
<prop key="hibernate.dialect">${hibernate.dialect}</prop>
<prop key="hibernate.show_sql">${hibernate.show_sql}</prop>
<prop key="hibernate.hbm2ddl.auto">create</prop>
</props>
</property>
</bean>

Unable to bulk insert using NHibernate

I've tried adding bulk inserts to my application, but the Batcher is still a NonBatchingBatcher with a BatchSize of 1.
This is using C#3, NH3RC1 and MySql 5.1
I've added this to my SessionFactory
<property name="adonet.batch_size">100</property>
And my code goes pretty much like this
var session = SessionManager.GetStatelessSession(type);
var tx = session.BeginTransaction();
session.Insert(instance);
I'm using HILO identity generation for the instances in question, but not for all instances on the database. The SessionFactory.OpenStatelessSession doesn't take a type, so it can't really know it can do batching on this type, or...?
After some digging into NHibernate, I found something in SettingsFactory.CreateBatcherFactory that might give some additional info
// It defaults to the NonBatchingBatcher
System.Type tBatcher = typeof (NonBatchingBatcherFactory);
// Environment.BatchStrategy == "adonet.factory_class", but I haven't
// defined this in my config file
string batcherClass = PropertiesHelper.GetString(Environment.BatchStrategy, properties, null);
if (string.IsNullOrEmpty(batcherClass))
{
if (batchSize > 0)
{
// MySqlDriver doesn't implement IEmbeddedBatcherFactoryProvider,
// so it still uses NonBatchingFactory
IEmbeddedBatcherFactoryProvider ebfp = connectionProvider.Driver as IEmbeddedBatcherFactoryProvider;
Could my configuration be wrong?
<hibernate-configuration xmlns="urn:nhibernate-configuration-2.2" >
<session-factory name="my application name">
<property name="adonet.batch_size">100</property>
<property name="connection.driver_class">NHibernate.Driver.MySqlDataDriver</property>
<property name="connection.connection_string">my connection string
</property>
<property name="dialect">NHibernate.Dialect.MySQL5Dialect</property>
<property name="proxyfactory.factory_class">NHibernate.ByteCode.Castle.ProxyFactoryFactory, NHibernate.ByteCode.Castle</property>
<!-- To avoid "The column 'Reserved Word' does not belong to the table : ReservedWords" -->
<property name="hbm2ddl.keywords">none</property>
</session-factory>
</hibernate-configuration>
I know this question is a year old, but there is a NuGet package that adds MySQL batching functionality to NHibernate. The reason that it's not baked directly into NHibernate is that the functionality required a reference to the MySQL.Data assembly, and the dev team didn't want the dependency.
IIRC, batching is currently supported for Oracle and SqlServer only.
As with almost any other aspect of NH, this is extensible, so you can write your own IBatcher/IBatcherFactory and inject them via configuration.
Sidenote: current version of NH is 3.0 GA.
Really old question, but let's be completely correct:
Another reason for batching not working can be the use of a stateless session (as in your case). Stateless sessions do not support batching. From the documentation:
The insert(), update() and delete() operations defined by the
StatelessSession interface are considered to be direct database
row-level operations, which result in immediate execution of a SQL
INSERT, UPDATE or DELETE respectively. Thus, they have very different
semantics to the Save(), SaveOrUpdate() and Delete() operations
defined by the ISession interface.

What would cause NHibernate to return an invalid identity selection when using JET?

Our application (sadly) uses an MDB back-end database (i.e. the JET engine).
One of the items being persisted to the database is an "event" object. The object is persisted to a table with an ID (EventLogID) that is an Autonumber field. The NHibernate mapping is as follows:
<class name="EventLogEntry" table="tblEventLog" proxy="IEventLogEntry">
<id name="Id">
<column name="EventLogID" not-null="true" />
<generator class="native" />
</id>
<property name="Source" column="ErrorLogSource" />
<property name="Text" column="EventLogText" />
<property name="Time" column="EventLogTime" />
<property name="User" column="UserID" />
<property name="Device" column="EventDeviceID" />
</class>
According to the log file, on some occasions when NHibernate attempts to obtain the identity, it receives the value "0". Later, when Flush is called, NHibernate suffers from an assertion failure.
Can anyone suggest why this might be happening? Better yet, can anyone suggest how to fix it?
Regards,
Richard
It could be that the default 'connection.release_mode' configuration setting is the cause of the problems.
A while ago, I ran into a similar issue, and I found that changing connection.release_mode to 'on_close' (instead of the default after_transaction) solved the issue.
More information can be found on my blog
Edit: now that I think of it, perhaps it can be solved without changing the release mode as well; what happens if you use a transaction to save your event?
The default release mode is after_transaction, so I'm thinking that when you use an explicit transaction, the connection will only be closed after the transaction. The question, of course, is whether NHibernate will try to retrieve the primary key that has been given to the object inside this transaction, or whether it will use another transaction...
If that does not work, then changing the release mode will solve your problem as well, but it is maybe not the best option.
I think the best option/solution is to use an explicit transaction first and see if this solves the problem...