I am trying to do batch updates using NHibernate, but it is not batching the updates; it is doing individual writes for every row. I have to write around 10k rows to the database.
using (var session = GetSessionFactory().OpenStatelessSession())
{
    session.SetBatchSize(100);
    using (var tx = session.BeginTransaction())
    {
        foreach (var pincode in list)
        {
            session.Update(pincode);
        }
        tx.Commit();
    }
}
I tried setting the batch size to 100 using session.SetBatchSize(100);, but that does not help. I also tried setting the batch size using cfg.SetProperty("adonet.batch_size", "100");, but that's not helping either.
I am using GUID primary keys, hence I don't understand the reason for the batch update failure. This is exactly the solution explained here, but it's not working for me.
NOTE: I have a version field for optimistic concurrency mapped on all the entities. Can that be the culprit preventing batch updates?
EDIT
I tried using a stateful session, but that also did not help.
// example 2
using (var session = GetSessionFactory().OpenSession())
{
    session.SetBatchSize(100);
    session.FlushMode = FlushMode.Commit;
    foreach (var pincode in list)
    {
        session.Update(pincode);
    }
    session.Flush();
}
// example 3
using (var session = GetSessionFactory().OpenSession())
{
    session.SetBatchSize(100);
    using (var tx = session.BeginTransaction())
    {
        foreach (var pincode in list)
        {
            session.Update(pincode);
        }
        tx.Commit();
    }
}
Example 2 is, for some reason, causing double round trips.
EDIT
After further research I found that each session.Update is actually updating the database immediately:
using (var session = SessionManager.GetStatelessSession())
{
    session.SetBatchSize(100);
    foreach (var record in list)
    {
        session.Update(record);
    }
}
How can I avoid that?
EDIT
I tried with the flush mode as well, but that's also not helping:
using (var session = SessionManager.GetNewSession())
{
    session.FlushMode = FlushMode.Never;
    session.SetBatchSize(100);
    session.BeginTransaction();
    foreach (var pincode in list)
    {
        session.SaveOrUpdate(pincode);
    }
    session.Flush();
    session.Transaction.Commit();
}
EDIT 4
Even the one below is not working, even though I am fetching all the entities in the same session and updating and saving them in that same session...
using (var session = SessionManager.GetSessionFactory().OpenSession())
{
    session.SetBatchSize(100);
    session.FlushMode = FlushMode.Commit;
    session.Transaction.Begin();
    var list = session.QueryOver<Pincode>().Take(1000).List();
    foreach (var pincode in list)
        pincode.Area = "Abcd" + DateTime.Now.ToString("HHmmssfff");
    foreach (var pincode in list)
        session.SaveOrUpdate(pincode);
    session.Flush();
    session.Transaction.Commit();
}
You are using a stateless session. Since a stateless session has no state, it cannot remember anything to do later. Hence the update is executed immediately.
NHibernate does not batch updates to versioned entities; that was the issue in my case.
There is no way to batch updates to versioned entities; the only way to get batching is to make the entity non-versioned.
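For illustration, a minimal mapping-by-code sketch (assuming NHibernate 3.2+; the Pincode properties are hypothetical) of the version mapping that would have to go:

using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

public class PincodeMap : ClassMapping<Pincode>
{
    public PincodeMap()
    {
        // guid.comb keeps inserts batchable, unlike identity/native generators
        Id(x => x.Id, m => m.Generator(Generators.GuidComb));
        Property(x => x.Area);

        // This is what defeats update batching; removing it trades
        // optimistic concurrency for batched UPDATEs:
        // Version(x => x.Version, m => { });
    }
}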
Note that:
Batches are not visible in SQL Server Profiler. Do not depend on that.
When inserting using identity (or native) id generators, NHibernate turns off ADO.NET batching.
Additional notes:
Make sure that you do not have a query for each changed entity, because the session flushes before queries.
You probably should not call session.Update. In the best case it doesn't do anything; in the worst case it really performs the update immediately, thus breaking batching.
When you have many objects in the session, don't forget to take care about flushes and flush time. Sometimes flushing is more time-consuming than the updates themselves. NHibernate flushes before commit, when you call Flush, and before queries, unless you turned that off or you use a stateless session. Make sure that you only flush once.
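Putting those notes together, a rough sketch of what the loop could look like once the entity is mapped without a version (based on the code in the question; load and modify in one session so no Update call is needed):

using (var session = GetSessionFactory().OpenSession())
{
    session.SetBatchSize(100);            // ADO.NET batch size for this session
    session.FlushMode = FlushMode.Commit; // defer writes to a single commit-time flush
    using (var tx = session.BeginTransaction())
    {
        // Entities loaded and modified in the same session are picked up
        // by the dirty check at flush time; no session.Update is needed.
        var pincodes = session.QueryOver<Pincode>().List();
        foreach (var pincode in pincodes)
        {
            pincode.Area = "Abcd";
        }
        tx.Commit(); // one flush here, with UPDATEs sent in batches of 100
    }
}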
This code always fails with a ConcurrencyException:
[Test]
public void EventOrderingCode_Fails_WithConcurrencyException()
{
    Guid id = Guid.NewGuid();
    using (var scope1 = new TransactionScope())
    using (var session = DataAccess.NewOpenSession)
    {
        session.Advanced.UseOptimisticConcurrency = true;
        session.Advanced.AllowNonAuthoritativeInformation = false;
        var ent1 = new CTEntity
        {
            Id = id,
            Name = "George"
        };
        using (var scope2 = new TransactionScope(TransactionScopeOption.RequiresNew))
        {
            session.Store(ent1);
            session.SaveChanges();
            scope2.Complete();
        }
        var ent2 = session.Load<CTEntity>(id);
        ent2.Name = "Gina";
        session.SaveChanges();
        scope1.Complete();
    }
}
It fails at the last session.SaveChanges, stating that it is using a non-current etag. If I use Required instead of RequiresNew for scope2 (i.e., using the same transaction), it works.
Now, since I load the entity (ent2), it should be using the newest etag, unless it is some cached value attached to scope1 that I am using (but I have disabled caching). So I do not understand why this fails.
I really need this setup. In the production code the outer TransactionScope is created by NServiceBus, and the inner is for controlling an aspect of event ordering. It cannot be the same Transaction.
And I need the optimistic concurrency too, in case other threads use the entity at the same time.
BTW: This is using Raven 2.0.3.0
Since no one else has answered, I had better give it a go myself.
It turns out this was a human error. Due to a bad configuration of our IoC container, DataAccess.NewOpenSession gave me the same session all the time (across other tests). In other words, Raven works as expected :)
Before I found out about this, I also experimented with using TransactionScopeOption.Suppress instead of RequiresNew. That also worked; then I just had to make sure that whatever I did in the suppressed scope could not fail, which was a valid option in my case.
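For reference, the Suppress variant just changes the inner scope's option (a sketch, reusing the test code above):

using (var scope2 = new TransactionScope(TransactionScopeOption.Suppress))
{
    // This work runs outside any ambient transaction, so scope1 cannot
    // roll it back - only safe when it is allowed to stand on its own.
    session.Store(ent1);
    session.SaveChanges();
    scope2.Complete();
}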
Within a single transaction scope, I am trying to save data to two separate Raven databases (on the same Raven server).
However, changes in the second session are not being saved to the server.
Here is the code I am using:
var documentStore = new DocumentStore { Url = "http://localhost:8081/" };
documentStore.Initialize();
documentStore.DatabaseCommands.EnsureDatabaseExists("database1");
documentStore.DatabaseCommands.EnsureDatabaseExists("database2");

using (var ts = new TransactionScope(TransactionScopeOption.RequiresNew))
{
    using (var session1 = documentStore.OpenSession("database1"))
    {
        session1.Store(new Entity { Id = "1" });
        session1.SaveChanges();
    }
    using (var session2 = documentStore.OpenSession("database2"))
    {
        session2.Store(new Entity { Id = "2" });
        session2.SaveChanges();
    }
    ts.Complete();
}
I can see changes in session1 being saved to database1, but changes in session2 are not saved to database2. I've tried with RavenDB builds 992 and 2360.
According to the Raven documentation, transactions across databases are supported.
How can I get changes from session2 to be committed?
The distributed transaction needs time to commit. You can either wait briefly with Thread.Sleep before checking your result, or you can set a special property on the session that checks the result:
session.Advanced.AllowNonAuthoritativeInformation = false;
This seems like a bug, but it is actually by design. Read Working with System.Transactions in the RavenDB documentation.
(FYI - I didn't know this either until I checked.)
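For example, a follow-up session that must see the committed data could look like this (a sketch, reusing the Entity class and database names from the question):

using (var session = documentStore.OpenSession("database2"))
{
    // Wait for the pending distributed transaction to commit instead of
    // returning possibly-stale (non-authoritative) data.
    session.Advanced.AllowNonAuthoritativeInformation = false;
    var entity = session.Load<Entity>("2");
}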
The following method queries my database, using a new session. If the query succeeds, it attaches (via "Lock") the result to a "MainSession" that is used to support lazy loading from a data-bound WinForms grid control.
If the result is already in the MainSession, I get the exception:
NHibernate.NonUniqueObjectException : a different object with the same identifier value was already associated with the session: 1, of entity: BI_OverlordDlsAppCore.OfeDlsMeasurement
when I attempt to re-attach, using the Lock method.
This happens even though I evict the result from the MainSession before I attempt to re-attach it.
I've used the same approach when I update a result, and it works fine.
Can anyone explain why this is happening?
How should I go about debugging this problem?
public static OfeMeasurementBase GetExistingMeasurement(OverlordAppType appType, DateTime startDateTime, short runNumber, short revision)
{
    OfeMeasurementBase measurement;
    var mainSession = GetMainSession();
    using (var session = _sessionFactory.OpenSession())
    using (var transaction = session.BeginTransaction())
    {
        // Get measurement that matches params
        measurement =
            session.CreateCriteria(typeof(OfeMeasurementBase))
                .Add(Expression.Eq("AppType", appType))
                .Add(Expression.Eq("StartDateTime", startDateTime))
                .Add(Expression.Eq("RunNumber", runNumber))
                .Add(Expression.Eq("Revision", revision))
                .UniqueResult() as OfeMeasurementBase;

        // Need to evict from main session, to prevent potential
        // NonUniqueObjectException if it's already in the main session
        mainSession.Evict(measurement);

        // Can't be attached to two sessions at once
        session.Evict(measurement);

        // Re-attach to main session
        // Still throws NonUniqueObjectException!!!
        mainSession.Lock(measurement, LockMode.None);

        transaction.Commit();
    }
    return measurement;
}
I resolved the problem after finding this Ayende post on Cross Session Operations.
The solution was to use ISession.Merge to get the detached measurement updated in the main session:
public static OfeMeasurementBase GetExistingMeasurement(OverlordAppType appType, DateTime startDateTime, short runNumber, short revision)
{
    OfeMeasurementBase measurement;
    var mainSession = GetMainSession();
    using (var session = _sessionFactory.OpenSession())
    using (var transaction = session.BeginTransaction())
    {
        // Get measurement that matches params
        measurement =
            session.CreateCriteria(typeof(OfeMeasurementBase))
                .Add(Expression.Eq("AppType", appType))
                .Add(Expression.Eq("StartDateTime", startDateTime))
                .Add(Expression.Eq("RunNumber", runNumber))
                .Add(Expression.Eq("Revision", revision))
                .UniqueResult() as OfeMeasurementBase;

        transaction.Commit();

        if (measurement == null) return null;

        // Merge back into main session, in case it has changed since the
        // main session was originally loaded
        var mergedMeasurement = (OfeMeasurementBase)mainSession.Merge(measurement);
        return mergedMeasurement;
    }
}
I am trying to add about 21,000 entities that are already in the database into an NHibernate.Search Lucene index. When done, the indexes are around 12 megabytes. I think the time can vary quite a bit, but it's always very slow. In my last run (running with the debugger), it took over 12 minutes to index the data.
private void IndexProducts(ISessionFactory sessionFactory)
{
    using (var hibernateSession = sessionFactory.GetCurrentSession())
    using (var luceneSession = Search.CreateFullTextSession(hibernateSession))
    {
        var tx = luceneSession.BeginTransaction();
        foreach (var prod in hibernateSession.Query<Product>())
        {
            luceneSession.Index(prod);
            hibernateSession.Evict(prod);
        }
        hibernateSession.Clear();
        tx.Commit();
    }
}
The vast majority of the time is spent in tx.Commit(). From what I've read of Hibernate Search, this is to be expected. I've come across quite a few ways to help, such as MassIndexer, flushToIndexes, batch modes, etc., but as far as I can tell these are Java-only options.
The session clear and evict are just desperate moves by me - I haven't seen them make a difference one way or another.
Has anyone had success quickly indexing a large amount of existing data?
I've been able to speed up indexing considerably by using a combination of batching and transactions.
My initial code took ~30 minutes to index ~20,000 entities. Using the code below I've got it down to ~4 minutes.
private void IndexEntities<TEntity>(IFullTextSession session) where TEntity : class
{
    var currentIndex = 0;
    const int batchSize = 500;
    while (true)
    {
        var entities = session
            .CreateCriteria<TEntity>()
            .SetFirstResult(currentIndex)
            .SetMaxResults(batchSize)
            .List();

        using (var tx = session.BeginTransaction())
        {
            foreach (var entity in entities)
            {
                session.Index(entity);
            }
            currentIndex += batchSize;
            session.Flush();
            tx.Commit();
            session.Clear();
        }

        if (entities.Count < batchSize)
            break;
    }
}
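Called, for example, by adapting the IndexProducts method from the question (a sketch):

private void IndexProducts(ISessionFactory sessionFactory)
{
    using (var hibernateSession = sessionFactory.OpenSession())
    using (var luceneSession = Search.CreateFullTextSession(hibernateSession))
    {
        IndexEntities<Product>(luceneSession);
    }
}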
It depends on the Lucene options you set. See this page and check whether NHibernate.Search has wrappers for these options. If it doesn't, modify its source.
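If the wrappers do exist, the setup might look something like this; the property names below mirror Hibernate Search's Java batch settings and are an assumption, so verify them against the NHibernate.Search source:

// Assumed property names, modeled on Hibernate Search (Java);
// check the NHibernate.Search source before relying on them.
cfg.SetProperty("hibernate.search.default.batch.merge_factor", "30");
cfg.SetProperty("hibernate.search.default.batch.max_buffered_docs", "1000");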
Considering the following code blocks, why does the call through HQL work, but the call to Delete() not work? As background, I'm using NHibernate over the IBM.Data.DB2.iSeries driver. Come to find out, journaling on the AS400 is turned off, so I can't use transactions. I'm not the AS400 admin and don't know anything about it, so I don't know whether having journaling turned off (and thus not opening transactions) is causing this problem or not. Do I absolutely need the ability to open transactions if I'm calling Delete() or other NHibernate functions?
// This does not work - no error and no deletes
public static void Delete(Object Entity)
{
    using (ISession session = _sessionFactory.OpenSession())
    {
        //using (ITransaction tx = session.BeginTransaction())
        //{
        session.Delete(Entity);
        //tx.Commit();
        session.Close();
        //}
    }
}
// This does work
public static void Delete(Object Entity)
{
    using (ISession session = _sessionFactory.OpenSession())
    {
        // commented out transaction control because it throws an error,
        // since journaling is not turned on on the AS400
        //using (ITransaction tx = session.BeginTransaction())
        //{
        session.CreateQuery("delete MyDAO p where p.MyDAOID = :MyDAOID")
            .SetString("MyDAOID", ((MyDAO)Entity).MyDAOID.ToString())
            .ExecuteUpdate();
        //}
    }
}
Try calling session.Flush() after you delete, but before you close the session. And you don't need to explicitly close the session:
using (var session = ...)
{
    session.Delete(entity);
    session.Flush();
}
After delving further into this, I found that this works, but why?
public static void Delete(Object Entity)
{
    using (ISession session = _sessionFactory.OpenSession())
    {
        MyDAO p = session.Get<MyDAO>(((MyDAO)Entity).MyDAOID);
        session.Delete(p);
        session.Flush();
    }
}
Here is my scenario: I have previously queried for a list of objects, which I populated into a grid. For that query process, I opened and closed a session. Upon calling my delete function, I take that object and pass it into this function for a separate open/close session process. To stir things up a bit, I changed this new delete to get the object first (the one I want deleted) and then call Delete. Now it sends the delete query to the database and the object is in fact deleted. Why is this?
Is this because my original query of objects was run within a different session and the delete happened in another? Does this have anything to do with the unit of work? Should I open the session for my grid and leave it open until my form closes, so all deletes work inside that session? I was under the impression that sessions should be opened and closed quickly. I'm confused.