NHibernate + concurrent edits: How to get notified of changes? - nhibernate

I'm looking into using NHibernate to handle the persistence layer in my business application. Right now, I'm wondering how to handle concurrent edits/updates in NHibernate, or more specifically, how to let multiple, distributed instances of my application know that one user changed a given dataset.
I'm not looking for versioning, i.e. consistency in the face of concurrent edits - I know NHibernate supports optimistic/pessimistic concurrency, and I currently use optimistic concurrency and handle the StaleObjectStateException.
I just want to know: given that user A changes a row in the dataset, how do I let user B know that a change occurred so that the dataset can be reloaded? Right now, I'm lazy loading a large list of customers into a grid using NHibernate. If a new customer is added, I'd like to add it to the grid without reloading all the data again - the application's first concern is performance.
Also, will NHibernate gracefully handle changes to existing rows? Will it somehow detect changes in the underlying database and update the in-memory .NET objects so that accessing their properties will yield the updated values?
I thought about using an additional table that stores the IDs of updated objects along with a timestamp, so I could refresh the items myself, but if NHibernate offers something of its own, that would obviously be a much better choice...

You need database-level notifications/events for this; that is, the database engine has to provide the notifications. For example, SQL Server has this feature. NHibernate runs in the application tier, so all it could potentially do by itself is poll the database for changes (your idea of using an additional table is essentially polling), and we all know that's not great. NHibernate's SysCache2 takes advantage of SqlCacheDependency to invalidate cache entries when the database raises a notification, but I'm not aware that it can raise any events to be consumed by the app itself (which wouldn't be that useful anyway, since the 2nd-level cache doesn't work with whole entities).
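For illustration, SQL Server's query notifications can be consumed from .NET via SqlDependency. A minimal sketch, assuming a placeholder connection string and query (the database also needs Service Broker enabled):
using System.Data.SqlClient;

SqlDependency.Start(connectionString);  // placeholder connection string

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("SELECT CustomerId, Name FROM dbo.Customer", connection))
{
    var dependency = new SqlDependency(command);
    dependency.OnChange += (sender, e) =>
    {
        // Fires once when the result set changes; re-subscribe and reload the grid here.
    };

    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        // Materialize the results; the subscription is registered as part of execution.
    }
}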
Another possible way to implement this would be having a NHibernate event listener place a broadcast message on a bus after any updates, then each application instance would receive this message and re-fetch from database accordingly.
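A rough sketch of that listener approach (the IMessageBus abstraction and the message type are hypothetical, not part of NHibernate):
using NHibernate.Event;

// Hypothetical bus abstraction; in practice this could be any messaging library.
public interface IMessageBus
{
    void Publish(object message);
}

public class EntityChangedMessage
{
    public string EntityName { get; set; }
    public object Id { get; set; }
}

public class BroadcastUpdateListener : IPostUpdateEventListener
{
    private readonly IMessageBus bus;

    public BroadcastUpdateListener(IMessageBus bus)
    {
        this.bus = bus;
    }

    public void OnPostUpdate(PostUpdateEvent @event)
    {
        // Runs after NHibernate has issued the UPDATE; other application instances
        // receive the message and re-fetch the affected entity.
        bus.Publish(new EntityChangedMessage
        {
            EntityName = @event.Entity.GetType().FullName,
            Id = @event.Id
        });
    }
}

The listener would be registered on the NHibernate Configuration (e.g. through its EventListeners.PostUpdateEventListeners collection) before building the session factory.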

I had to solve some similar issues on my current SQLite Fluent NHibernate project.
For detecting database updates, I used a System.IO.FileSystemWatcher. See the sample code in my answer to my own, similar, question. Also, take a look at another answer to the same question, which suggested using NHibernate Interceptors.
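For reference, a minimal FileSystemWatcher setup along those lines (the directory and file name are placeholders for wherever the SQLite file lives):
using System.IO;

var watcher = new FileSystemWatcher(@"C:\Data", "measurements.db")  // placeholder path
{
    NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size
};

watcher.Changed += (sender, e) =>
{
    // The DB file was written by another process; query for records with an Id
    // greater than the last one loaded (see the criteria query below).
};

watcher.EnableRaisingEvents = true;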
I also lazy load my DB into a grid. Whenever the FileSystemWatcher detects the DB has been updated, I call the following Criteria Query, and add the new records that it returns to the BindingList that is the DataSource for my grid. (I just save the highest ID of the new records in a private variable)
/// <summary>
/// Return list of measurements with Id fields higher than the id parameter
/// </summary>
/// <param name="id"></param>
/// <returns></returns>
public static IList<T> GetMeasurementsNewerThan<T>(int id)
{
    var session = GetMainSession();
    using (var transaction = session.BeginTransaction())
    {
        var newestMeasurements =
            session.CreateCriteria(typeof(T))
                .Add(Expression.Gt("Id", id))
                .List<T>();
        transaction.Commit();
        return newestMeasurements;
    }
}

Related

How Can a Data Access Object (DAO) Allow Simultaneous Updates to a Subset of Columns?

Please forgive me if I misuse any OOP terminology as I'm still getting my feet wet on the subject.
I've been reading up on object oriented programming (OOP) - specifically for web applications. I have been going over the concept of a data access object (DAO). The DAO is responsible for CRUD (Create, Read, Update, and Delete) methods and connecting your application's service (business logic) layer to the database.
My question specifically pertains to the update() method within a DAO. In the examples I've read about, developers typically pass a bean object into the DAO's update() method as its main argument, e.g. updateCustomer(customerBean). The method then executes some SQL which updates all of the columns based on the data in the bean.
The problem I see with this logic is that the update() method updates ALL columns within the database based on the bean's data and could theoretically cause it to overwrite columns another user or system might need to update simultaneously.
A simplified example might be:
User 1 updates field A in the bean
User 2 updates field B in the bean
User 2 passes bean to DAO, DAO updates all fields.
User 1 passes bean to DAO, DAO updates all fields.
User 2's changes have been lost!
I've read about Optimistic Locking and Pessimistic Locking as possible solutions for only allowing one update at a time but I can think of many cases where an application needs to allow for editing different parts of a record at the same time without locking or throwing an error.
For example, let's say an administrator is updating a customer's lastName at the same time the customer logs into the web site and the login system needs to update the dateLastLoggedIn column, while simultaneously a scheduled task needs to update a lastPaymentReminderDate column. In this crazy example, if you were passing a bean object to the update() method and saving the entire record each time, it's possible that whichever process runs update() last would overwrite all of the data.
Surely there must be a way to solve this. I've come up with a few possibilities based on my research but I would be curious to know the proper/best way to accomplish this.
Possible solution 1: DAO Update() Method Does Not Accept Bean as Argument
If the update() method accepted a structure of data containing only the columns that need updating, instead of a bean object, you could make your SQL statement smart enough to update only the fields that were passed to the method. For example, the argument might look like this:
{
    customerID: 1,
    firstName: 'John'
}
This would basically tell the update() method to update only the firstName column for the customer with customerID 1. This would make your DAO extremely flexible and would give the service layer the ability to dynamically interact with the database. I have a gut feeling that this violates some "golden rule" of OOP, but I'm not sure which one. I've also never seen any examples online of a DAO behaving like this.
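A minimal sketch of what this could look like with plain ADO.NET (the DynamicUpdate helper, table and column names, and the changes dictionary are all illustrative assumptions; the keys would need to be checked against a whitelist of real column names before being put into the SQL):
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class CustomerDao
{
    // Hypothetical helper: updates only the columns present in 'changes'.
    // Column names must come from a known whitelist, never from user input.
    public static void DynamicUpdate(IDbConnection connection, int customerId,
                                     IDictionary<string, object> changes)
    {
        var setClause = string.Join(", ", changes.Keys.Select(c => c + " = @" + c));

        using (var command = connection.CreateCommand())
        {
            command.CommandText =
                "UPDATE Customer SET " + setClause + " WHERE CustomerID = @customerId";

            foreach (var change in changes)
            {
                var parameter = command.CreateParameter();
                parameter.ParameterName = "@" + change.Key;
                parameter.Value = change.Value;
                command.Parameters.Add(parameter);
            }

            var idParameter = command.CreateParameter();
            idParameter.ParameterName = "@customerId";
            idParameter.Value = customerId;
            command.Parameters.Add(idParameter);

            command.ExecuteNonQuery();
        }
    }
}

Calling DynamicUpdate(connection, 1, new Dictionary<string, object> { { "firstName", "John" } }) would then touch only the firstName column for customer 1.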
Possible Solution 2: Add additional update() methods to your DAO.
You could also solve this by adding more specific update() methods to your DAO. For example, you might have one for dateLastLoggedIn() and another for dateLastPaymentReminderDate(). This way each service that needs to update the record could theoretically do so simultaneously. Any locking could be done per update method if needed.
The main downside of this approach is that your DAO will start to get pretty muddy with all kinds of update statements and I've seen many blog posts writing about how messy DAOs can quickly become.
How would you solve this type of conundrum with DAO objects assuming you need to allow for updating subsets of record data simultaneously? Would you stick with passing a bean to the DAO or is there some other solution I haven't considered?
If you do a DAO.read() operation that returns a bean, then update the bean with the user's new values, then pass that bean to the DAO.update(bean) method, you shouldn't have a problem unless the two user operations happen within milliseconds of each other. Your question implies that the beans are being stored in session scope or something like that before being passed to the update() method. If that's what you're doing, don't, for exactly the reasons you described: you don't want your bean getting out of sync with the DB record. For even better safety, wrap a transaction around the read and update operations; then there'd be no way the two users could step on each other's toes, even if user 2 submits their changes at the exact same time as user 1.
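A rough sketch of that read, set values, update flow wrapped in a transaction (the ICustomerDao interface, the Customer bean, and the connection handling here are hypothetical):
using System.Data;

public class Customer
{
    public int CustomerId { get; set; }
    public string LastName { get; set; }
}

// Hypothetical DAO abstraction that participates in an externally managed transaction.
public interface ICustomerDao
{
    Customer Read(int customerId, IDbTransaction transaction);
    void Update(Customer customer, IDbTransaction transaction);
}

public class CustomerService
{
    private readonly IDbConnection connection;
    private readonly ICustomerDao dao;

    public CustomerService(IDbConnection connection, ICustomerDao dao)
    {
        this.connection = connection;
        this.dao = dao;
    }

    public void RenameCustomer(int customerId, string newLastName)
    {
        // Read and update inside one transaction so the bean cannot go stale
        // between the read and the write.
        using (var transaction = connection.BeginTransaction())
        {
            var customer = dao.Read(customerId, transaction);
            customer.LastName = newLastName;
            dao.Update(customer, transaction);
            transaction.Commit();
        }
    }
}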
Read(), set values, update() is the way to go, I think. Keep the beans fresh. Nobody wants stale beans.

Repeating a query does not refresh the properties of the returned objects

When doing a criteria query with NHibernate, I want to get fresh results and not old ones from a cache.
The process is basically:
Query persistent objects into the NHibernate application.
Change database entries externally (another program, manual edit in SSMS / MSSQL, etc.).
Query the persistent objects again (with the same query code); the previously loaded objects should be refreshed from the database.
Here's the code (slightly changed object names):
public IOrder GetOrderByOrderId(int orderId)
{
    ...
    IList result;
    var query =
        session.CreateCriteria(typeof(Order))
            .SetFetchMode("Products", FetchMode.Eager)
            .SetFetchMode("Customer", FetchMode.Eager)
            .SetFetchMode("OrderItems", FetchMode.Eager)
            .Add(Restrictions.Eq("OrderId", orderId));
    query.SetCacheMode(CacheMode.Ignore);
    query.SetCacheable(false);
    result = query.List();
    ...
}
The SetCacheMode and SetCacheable have been added by me to disable the cache. Also, the NHibernate factory is set up with config parameter UseQueryCache=false:
Cfg.SetProperty(NHibernate.Cfg.Environment.UseQueryCache, "false");
No matter what I do, including Put/Refresh cache modes for the query or session, NHibernate keeps returning outdated objects the second time the query is called, without the externally committed changes. For what it's worth, the outdated value in this case is the value of a Version column (used to test whether a stale object state can be detected before saving). But I need fresh query results for multiple reasons!
NHibernate even generates an SQL query, but it is never used for the values returned.
Keeping the sessions open is necessary to do dynamic updates on dirty columns only (stateless sessions are not a solution either); I don't want to add Clear(), Evict() or the like everywhere in the code, especially since the query sits at a lower level and doesn't know which objects were previously loaded. Pessimistic locking would kill performance (multi-user environment!).
Is there any way to force NHibernate, by configuration, to send queries directly to the DB and get fresh results, not using unwanted caching functions?
First of all: this doesn't have anything to do with second-level caching (which is what SetCacheMode and SetCacheable control). Even if it did, those control caching of the query, not caching of the returned entities.
When an object has already been loaded into the current session (also called the "first-level cache" by some people, although it's not a cache but an Identity Map), querying it again from the DB using any method will never overwrite its values.
This is by design and there are good reasons for it behaving this way.
If you need to update potentially changed values in multiple records with a query, you will have to Evict them previously.
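A minimal sketch of the Evict-then-requery idea, reusing the Order/orderId names from the question (session.Refresh is shown as a per-object alternative):
// Drop the stale instance from the session's identity map, then query again
// so NHibernate rehydrates it from the database.
session.Evict(staleOrder);
var freshOrder = session.CreateCriteria(typeof(Order))
    .Add(Restrictions.Eq("OrderId", orderId))
    .UniqueResult<Order>();

// Or, for a single already-loaded object, re-read its state in place:
session.Refresh(staleOrder);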
Alternatively, you might want to read about Stateless Sessions.
Is this code running in a transaction? Or is that external process running in a transaction? If one of those two is still in a transaction, you will not see any updates.
If that is not the case, you might be able to find the problem in the log messages that NHibernate is creating. These are very informative and will always tell you exactly what it is doing.
Keeping the sessions open is necessary to do dynamic updates on dirty columns only
This is either the problem now or it will become a problem in the future. NHibernate is doing all it can to make your life easier, but you are doing as much as possible to prevent NHibernate from doing its job properly.
If you want NHibernate to update the dirty columns only, you could look at the dynamic-update attribute in your class mapping.
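For example, if the mappings are done with Fluent NHibernate, the equivalent of the dynamic-update="true" mapping attribute looks roughly like this (class and property names are placeholders):
using FluentNHibernate.Mapping;

public class OrderMap : ClassMap<Order>
{
    public OrderMap()
    {
        Table("Orders");
        DynamicUpdate();   // emit UPDATE statements containing only the dirty columns
        Id(x => x.OrderId);
        Map(x => x.Status);
    }
}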

Why evicting objects from Session does not commit changes to database?

I am using NHibernate in my application and have the following situation. (Note that the situation here is much more simplified for the sake of understanding)
An object is queried from the database. It is edited and updated using Session.Update(). The object is then Evicted from the Session (using Session.Evict(obj)) before the transaction is committed. Expected result is that the changes are persisted to the database.
This used to work fine when I had my Id column as a NHibernate identity column.
Recently, we changed the Id column to be non-identity. As a result, the above scenario does not persist to the database unless I explicitly call Session.Flush() before I Evict.
Can someone point/explain to me the reason for this behavior?
This link, NHibernate Session.Flush & Evict vs Clear, mentions something about Evict and the identity column, which to me is not very clear.
Your workflow is not correct.
First, when you retrieve an object from the database, session.Update(entity) does not do anything; changes to the loaded object are persisted automatically on Flush/Commit.
Next, Evict removes all knowledge the session has of the object, therefore it will not persist any changes applied to it. You should almost never use this method under normal conditions, which makes me think you are not handling the session correctly.
Third, the fact that using identity causes inserts to happen immediately on Save is a limitation, not a feature.
The correct workflow is:
using (var session = factory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    var entity = session.Get<EntityType>(id);
    entity.SomeProperty = newValue;
    transaction.Commit();
}
The exact structure (using statements, etc) can change in a desktop application, but the basic idea is the same.
Identity forces NHibernate to immediately save the entity to the database on session.Save(); a non-identity generator allows it to batch the inserts and send them as a whole. Evict removes all information about the object from the session. So with identity, even though the session forgets about the entity after Evict, the row is already in the database; without identity, the pending insert is simply lost.
To remedy that you can:
set FlushMode.Auto (so pending changes are flushed automatically)
call session.Flush() before Evict (see the sketch below)
Evict only after the transaction has completed
Which of these is the best option depends on the context.
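A rough sketch of the second option, reusing the names from the workflow example above:
using (var session = factory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    var entity = session.Get<EntityType>(id);
    entity.SomeProperty = newValue;

    session.Flush();        // push the pending UPDATE to the database now
    session.Evict(entity);  // the session forgets the object, but the statement is already issued

    transaction.Commit();   // the transaction still has to commit for the change to stick
}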

NHibernate transaction isolation level during persist - help

I have a question. Imagine you want to save an object in a transaction, the object having collections of other objects etc., so it's a more "complex" object.
Anyway, sometimes we save objects like that, but in the meantime another thread occasionally reads said data and synchronizes it up to our central server. However, we've noticed that on some occasions objects get synced over without all of their collection objects.
Since this only happens every once in a while, we figured it could be the transaction isolation level. Maybe the synchronization thread reads the data before the transaction is done persisting all the objects, thus only reading half the data needed and sending it over.
We know that the client's data is all saved, all the time; it's just that sometimes it doesn't tag along when it's being sent to us.
So we'd want some kind of lock, I suppose; I just don't know anything about these locks. Which one should we use?
There are no outside sources working towards the database in this case, since it's a WPF application on a client's customer.
Any help would be appreciated!
Best regards,
E.
Every database supports a set of standard isolation levels. These are all meant to prevent, to a certain degree, reading data that is being modified inside another transaction. I suggest you first read up on what these isolation levels mean.
In your specific situation, I'd suggest that for the transaction that is reading the data, you use at least an isolation level of ReadCommitted. In code, this would look like this:
using (var transactionScope = new TransactionScope(TransactionScopeOption.Required,
    new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted }))
{
    // Read the data you want from the database.
    ...

    transactionScope.Complete();

    // Return the data.
}
Using a TransactionScope with IsolationLevel.ReadCommitted prevents that you read data that has not yet been committed by another transaction.
The code that writes data to the database should also be put inside one transaction. As long as you only write data inside that transaction, the isolation level for that transaction doesn't matter. This guarantees the atomicity of your updates: either all updates succeed or none of them. This also prevents another transaction from reading a partial update.
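A minimal sketch of the write side (Order and its collections are placeholder names; with cascading mappings the children are saved along with the parent):
using (var session = factory.OpenSession())
using (var transaction = session.BeginTransaction())
{
    session.Save(order);    // the parent and its collections are persisted in one unit of work
    transaction.Commit();   // a ReadCommitted reader never sees a half-written object graph
}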

Strange behaviour of code inside TransactionScope?

We are facing a very complex issue in our production application.
We have a WCF method which creates a complex Entity in the database with all its relation.
public void InsertEntity(Entity entity)
{
    using (TransactionScope scope = new TransactionScope())
    {
        EntityDao.Create(entity);
    }
}
The EntityDao.Create(entity) method is very complex and contains large pieces of logic. During the entire process of creation it creates several child entities and also issues several queries to the database.
During the entire WCF request for entity creation, the connection is usually maintained in a ThreadStatic variable and reused by the DAOs, although some of the queries in the DAO described in step 2 use a new connection and close it after use.
Overall, we have seen that the behaviour of the above process is erratic. Some of the queries in the inner DAOs do not even return actual data from the database, while the same query run directly against the actual data store gives the correct result.
What can be possible reason of this behaviour?
ThreadStatic is not recommended; use CallContext instead. I have code at http://code.google.com/p/softwareishardwork/ which demos the correct way to handle connections in the manner you describe (tested in severe high-performance scenarios). Try a test case using this code.
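As a rough illustration of the difference (this ConnectionContext helper and its key are hypothetical and not taken from the linked project), a CallContext-based holder follows the logical call context of the request rather than whichever physical thread happens to run it:
using System.Data;
using System.Runtime.Remoting.Messaging;

public static class ConnectionContext
{
    private const string Key = "Current.IDbConnection";  // hypothetical slot name

    // Replaces a [ThreadStatic] field: the value is stored per logical call context.
    public static IDbConnection Current
    {
        get { return (IDbConnection)CallContext.GetData(Key); }
        set { CallContext.SetData(Key, value); }
    }
}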