Repeating a query does not refresh the properties of the returned objects - NHibernate

When doing a criteria query with NHibernate, I want to get fresh results and not old ones from a cache.
The process is basically:
Query persistent objects into the NHibernate application.
Change database entries externally (another program, a manual edit in SSMS/MSSQL, etc.).
Run the same query again; the previously loaded objects should be refreshed from the database.
Here's the code (slightly changed object names):
public IOrder GetOrderByOrderId(int orderId)
{
    ...
    IList result;
    var query =
        session.CreateCriteria(typeof(Order))
            .SetFetchMode("Products", FetchMode.Eager)
            .SetFetchMode("Customer", FetchMode.Eager)
            .SetFetchMode("OrderItems", FetchMode.Eager)
            .Add(Restrictions.Eq("OrderId", orderId));
    query.SetCacheMode(CacheMode.Ignore);
    query.SetCacheable(false);
    result = query.List();
    ...
}
The SetCacheMode and SetCacheable calls were added by me to disable caching. The NHibernate factory is also set up with the config parameter UseQueryCache=false:
Cfg.SetProperty(NHibernate.Cfg.Environment.UseQueryCache, "false");
No matter what I do, including Put/Refresh cache modes on the query or the session, NHibernate keeps returning outdated objects the second time the query is called, without the externally committed changes. For the record: the outdated value in this case is the value of a Version column (to test whether a stale object state can be detected before saving), but I need fresh query results for multiple reasons!
NHibernate even generates an SQL query, but it is never used for the values returned.
Keeping the sessions open is necessary to do dynamic updates on dirty columns only (stateless sessions are not an option either). I don't want to scatter Clear(), Evict() or similar calls throughout the code, especially since the query sits at a lower level and doesn't know which objects were previously loaded. Pessimistic locking would kill performance (multi-user environment!).
Is there any way to force NHibernate, by configuration, to send queries directly to the DB and get fresh results, not using unwanted caching functions?

First of all: this doesn't have anything to do with second-level caching (which is what SetCacheMode and SetCacheable control). Even if it did, those control caching of the query, not caching of the returned entities.
When an object has already been loaded into the current session (also called the "first-level cache" by some people, although it's not really a cache but an Identity Map), querying it again from the DB using any method will never overwrite its values.
This is by design and there are good reasons for it behaving this way.
If you need to update potentially changed values in multiple records with a query, you will have to Evict them beforehand (see the sketch below).
Alternatively, you might want to read about Stateless Sessions.
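As a minimal sketch of the eviction advice above (assuming the session and Order entity from the question; session.Refresh is the in-place alternative NHibernate offers):

// Option 1: re-read the instance's state from the database in place.
var order = session.Get<Order>(orderId);
session.Refresh(order);

// Option 2: evict the instance first, so the next query hydrates a fresh copy
// instead of returning the one already tracked by the session.
session.Evict(order);
var fresh = session.CreateCriteria(typeof(Order))
    .Add(Restrictions.Eq("OrderId", orderId))
    .List<Order>();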

Is this code running in a transaction? Or is that external process running in a transaction? If one of those two is still in a transaction, you will not see any updates.
If that is not the case, you might be able to find the problem in the log messages that NHibernate is creating. These are very informative and will always tell you exactly what it is doing.
Keeping the sessions open is neccessary to do dynamic updates on dirty columns only
This is either the problem or it will become a problem in the future. NHibernate is doing all it can to make your life better, but you are trying to do as much as possible to prevent NHibernate from doing its job properly.
If you want NHibernate to update the dirty columns only, you could look at the dynamic-update attribute in your class mapping file; a sketch of what that looks like is shown below.
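Roughly, the attribute sits on the class element of the hbm.xml mapping (a sketch only; the table and property names are taken from the question and may not match the real mapping):

<!-- dynamic-update makes NHibernate emit UPDATE statements
     containing only the columns that actually changed -->
<class name="Order" table="Orders" dynamic-update="true">
  <id name="OrderId" />
  <version name="Version" />
  <!-- remaining properties unchanged -->
</class>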

Related

Why is NHibernate cascading before executing a query?

So I'm using the NHibernate.Linq API to run a query. The query itself only takes about 50 ms, but the total time spent by NHibernate to return the result is around 500 ms.
The query is something like the following: session.Query<T>().Where(i => i.ForeignKey == someValue).Take(5).ToList().AsQueryable(). According to the debug log, this query executes in under ~50 ms, from creating the HQLQueryPlan, through setting up parameters and opening the connection, to hydrating the C# objects and finishing.
However, before it even gets that far, the debug log shows that NHibernate spends ~400 ms cascading saves and updates, and then reports a bundle of "Collection found:" log entries which appear to be collection properties of the objects that I'm asking for. I should add that the ISession being used is read-only and has absolutely not modified or created any entity. After spending these 400 ms, it then logs that it flushed 0 changes.
Why is NHibernate cascading a load of save/update commands during execution of a .Query<T>()? Is it trying to make sure that the correct data is retrieved? Is this the result of some configuration that I have set?
Such behavior is only possible for queries executed inside a transaction, and when:
You have set FlushMode to FlushMode.Always
Or NHibernate has detected that your query involves tables that were modified in the current session. This auto-flush-before-query behaviour can be disabled by setting FlushMode to something lower than FlushMode.Auto (the default), i.e. FlushMode.Commit or FlushMode.Manual.
You can change the FlushMode for your session, or specify the default flush mode via the default_flush_mode configuration setting; see the documentation for details. A sketch of both options follows.
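A minimal sketch of both options (cfg and sessionFactory are placeholders for your own configuration and factory objects; the configuration property is only available in NHibernate versions that support default_flush_mode):

// Per-session: don't auto-flush before queries; flush only on commit.
using (var session = sessionFactory.OpenSession())
{
    session.FlushMode = FlushMode.Commit;
    // ... run the read-only queries here ...
}

// Globally, via configuration (newer NHibernate versions):
cfg.SetProperty("default_flush_mode", "Commit");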

ASP.NET Core - caching users from Identity

I'm working with a standard ASP.NET Core 2.1 program, and I've been considering the problem that a great many of my controller methods need to retrieve the currently logged-on user.
I noticed that the ASP.NET Core Identity code uses a DbSet to hold the entities, so subsequent calls to it should read the local entities from memory instead of hitting the DB. But it appears that my code triggers a DB read every time (I know because I'm running SQL Profiler and can see the SELECT queries against AspNetUsers being run with Id as the key).
I know there are many ways to set Identity up, and it has changed over the versions, so maybe I'm not doing something right; or is there a fundamental problem here that could be addressed?
I set up the default EF and Identity stores in startup.cs's ConfigureServices:
services.AddDbContext<MyDBContext>(options => options.UseSqlServer(Configuration.GetConnectionString("MyDBContext")));
services.AddIdentity<CustomIdentity, Models.Role>().AddDefaultTokenProviders().AddEntityFrameworkStores<MyDBContext>();
and read the user in each controller method:
var user = await _userManager.GetUserAsync(HttpContext.User);
In the Identity code, it seems that this method calls the UserStore's FindByIdAsync method, which calls FindAsync on the DbSet of users.
The EF performance paper says:
It’s important to note that two different ObjectContext instances will have two different ObjectStateManager instances, meaning that they have separate object caches.
So what could be going wrong here? Any suggestions why ASP.NET Core's EF calls within UserStore are not using the local DbSet of entities? Or am I thinking about this wrongly, and each time a call is made to a controller a new EF context is created?
any suggestions why ASP.NET Core's EF calls within Userstore are not using the local DBSet of entities?
Actually, FindAsync does do that. Quoting MSDN (emphasis mine)...
Asynchronously finds an entity with the given primary key values. If an entity with the given primary key values exists in the context, then it is returned immediately without making a request to the store. Otherwise, a request is made to the store for an entity with the given primary key values and this entity, if found, is attached to the context and returned. If no entity is found in the context or the store, then null is returned.
So you can't avoid the initial read per request for the object. But subsequent reads in the same request won't query the store. That's the best you can do short of crazy levels of micro-optimization.
Yes. Controllers are instantiated and destroyed with each request, regardless of whether it's the same or a different user making the request. Further, the context is request-scoped, so it too is instantiated and destroyed with each request. If you query the same user multiple times during the same request, EF will attempt to use the entity cache for the subsequent queries, but you're likely not doing that.
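If the same request really does need the user in several places, one common pattern is a request-scoped accessor that memoizes the lookup. A rough sketch, assuming the CustomIdentity user type from the question (the class and method names here are invented for illustration):

using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Identity;

// Sketch only: caches the current user for the lifetime of one request.
public class CurrentUserAccessor
{
    private readonly UserManager<CustomIdentity> _userManager;
    private readonly IHttpContextAccessor _httpContextAccessor;
    private CustomIdentity _cachedUser;

    public CurrentUserAccessor(UserManager<CustomIdentity> userManager,
                               IHttpContextAccessor httpContextAccessor)
    {
        _userManager = userManager;
        _httpContextAccessor = httpContextAccessor;
    }

    public async Task<CustomIdentity> GetCurrentUserAsync()
    {
        // First call hits the store; later calls in the same request reuse the result.
        return _cachedUser ??
               (_cachedUser = await _userManager.GetUserAsync(_httpContextAccessor.HttpContext.User));
    }
}

// Registered as scoped so the cached value lives exactly as long as the request:
// services.AddScoped<CurrentUserAccessor>();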
That said, this is a text-book example of premature optimization. Querying a user from the database is an extremely quick query. It's just a simple select statement on a primary key - it doesn't get much quicker or simpler as far as database queries go. You might be able to save a few milliseconds if you utilize memory caching, but that comes with a whole set of considerations, particularly being careful to segregate the cache by user, so that you don't accidentally bring in the wrong data for the wrong user. More to the point, memory cache is problematic for a host of reasons, so it's more typical to use distributed caching in production. Once you go there, caching doesn't really buy you anything for a simple query like this, because you're merely fetching it from the distributed cache store (which could even be a database like SQL Server) instead of your database. It only makes sense to cache complex and/or slow queries, as it's only then that retrieving it from cache actually ends up being quicker than just hitting the database again.
Long and short, don't be afraid to query the database. That's what it's there for. It's your source for data, so if you need the data, make the query. Once you have your site going, you can profile or otherwise monitor the performance, and if you notice slow or excessive queries, then you can start looking at ways to optimize. Don't worry about it until it's actually a problem.

NHibernate mapping very slow

I am using NHibernate to create a collection of immutable domain objects from a legacy Oracle DB. Some simple lookups using the Criteria API take over 60 seconds. Subsequent runs of the same lookup are very fast, usually less than 300 ms (100 ms in the DB and the rest in NHibernate; I don't have the second-level cache or query cache enabled, and all queries do go to the DB, which I checked using NHibernate Profiler). If, however, I leave the app idle for a couple of minutes and run the lookup again, it usually takes 50-60 seconds again.
I have used NHibernate Profiler, and in every case it clearly shows that at most 100 ms is spent in the database; I figure the rest of the time must be taken by NHibernate, and I can't understand why.
Some background info:
I am using dynamic-component to map 20 columns into key-value pairs.
Using NHibernate 2.1.
I am using dynamic-component in the mapping.
Once retrieved, the data is never modified; in the mapping I am using the mutable=false flag.
It's a legacy DB, so I am using a composite key in the mapping.
I am only retrieving around 50 objects in each lookup.
When I open a session I set FlushMode=Never.
I also tried a stateless session (still slow performance on the initial lookup).
I do not define or use any custom user types in the mapping.
I am clearly doing something wrong or have missed something. Any ideas?
I suggest downloading a C# performance profiler such as dotTrace. You will be able to quickly get a more accurate understanding of where your performance problem is. I'm pretty sure it is not an NHibernate mapping issue.
How is the lifetime of your SessionFactory being managed? Is it possible that your SessionFactory is being disposed of after some period of inactivity?
It is most likely not an NHibernate issue.
Use the code below to measure the amount of time it takes to get your data back (DB + network latency + NHibernate execution).
Once you are positive that there is no app-related latency involved, check the database by looking at query plan caching and query result caching. The first time the query runs (a cache miss), your DB will perform time-consuming, intensive operations to generate the result set.
If 1 and 2 don't yield any useful information, check your network. Maybe some network pressure is causing heavy latency.
As mentioned by JeffreyABecker below, study how your session factories get disposed/created. Find usages of ISessionFactory.Dispose() or configuration.BuildSessionFactory(). Building ISessionFactory objects is an expensive operation and, typically, you should create them on application start and dispose of them on application stop/shutdown. 60 s is still a plausible figure for ISessionFactory instantiation.
// Requires: using System; using System.Diagnostics;
Stopwatch stopwatch = new Stopwatch();

// Begin timing
stopwatch.Start();

// NHibernate-specific stuff ONLY in here.
// Depending on your setup, do a session.Flush() if possible.

// End timing
stopwatch.Stop();

// Write the result - console/log4net/Diagnostics.Debug/etc.
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
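To rule out the session-factory lifetime issue mentioned above, the usual arrangement is to build the ISessionFactory once per application and reuse it. A rough sketch (NHibernateHolder is an invented name; the Configure() call assumes a standard hibernate.cfg.xml / app.config setup):

using NHibernate;
using NHibernate.Cfg;

// Sketch: build the ISessionFactory once and reuse it for the whole application.
public static class NHibernateHolder
{
    private static readonly ISessionFactory _factory = BuildFactory();

    private static ISessionFactory BuildFactory()
    {
        var cfg = new Configuration();
        cfg.Configure();                  // reads hibernate.cfg.xml / app.config
        return cfg.BuildSessionFactory(); // expensive: do this only once
    }

    public static ISessionFactory Factory
    {
        get { return _factory; }
    }
}

// Usage: open cheap, short-lived sessions from the long-lived factory.
// using (var session = NHibernateHolder.Factory.OpenSession()) { ... }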

NHibernate Caching Dilemma

My application includes a client, web tier (load balanced), application tier (load balanced), and database tier. The web tier exposes services to clients, and forwards calls onto the application tier. The application tier then executes queries against the database (using NHibernate) and returns the results.
Data is mostly read, but writes occur fairly frequently, particularly as new data enters the system. Much more often than not, data is aggregated and those aggregations are returned to the client - not the original data.
Typically, users will be interested in the aggregation of recent data - say, from the past week. Thus, to me it makes sense to introduce a cache that includes all data from the past 7 days. I cannot just cache entities as and when they are loaded because I need to aggregate over a range of entities, and that range is dictated by the client, along with other complications, such as filters. I need to know whether - for a given range of time - all data within that range is in the cache or not.
In my ideal fantasy world, my services would not have to change at all:
public AggregationResults DoIt(DateTime starting, DateTime ending, Filter filter)
{
    // execute HQL/criteria call and have it automatically use the cache where possible
}
There would be a separate filtering layer that would hook into NHibernate and intelligently and transparently determine whether the HQL/criteria query could be executed against the cache or not, and would only go to the database if necessary. If all the data was in the cache, it would query the cached data itself, kind of like an in-memory database.
However, on first inspection, NHibernate's second level cache mechanism does not seem appropriate for my needs. What I'd like to be able to do is:
1. Configure it to always have the last 7 days' worth of data in the cache, e.g. "for this table, cache all records where this field is between 7 days ago and now."
2. Have the ability to manually maintain the cache. As new data enters the system, it would be nice if I could just throw it straight into the cache rather than waiting until the cache is invalidated. Similarly, as data falls out of the time period, I'd like to be able to pull it from the cache.
3. Have NHibernate intelligently understand when it can serve a query directly from the cache rather than hitting the database at all, e.g. if the user asks for an aggregate of data over the past 3 days, that aggregation should be calculated directly from the cache rather than touching the DB.
Now, I'm pretty sure #3 is asking too much. Even if I can get the cache populated with all the data required, NHibernate has no idea how to efficiently query that data. It would literally have to loop over all entities in order to discriminate which are relevant to the query (which might be fine, to be honest). Also, it would require an implementation of NHibernate's query engine that executed against objects rather than a database. But I can dream, right?
Assuming #3 is asking too much, I would require some logic in my services like this:
public AggregationResults DoIt(DateTime starting, DateTime ending, Filter filter)
{
    if (CanBeServicedFromCache(starting, ending, filter))
    {
        // execute some LINQ to Objects code or whatever to determine the aggregation results
    }
    else
    {
        // execute HQL/criteria call to determine the aggregation results
    }
}
This isn't ideal because each service must be cache-aware, and must duplicate the aggregation logic: once for querying the database via NHibernate, and once for querying the cache.
That said, it would be nice if I could at least store the relevant data in NHibernate's second level cache. Doing so would allow other services (that don't do aggregation) to transparently benefit from the cache. It would also ensure that I'm not doubling up on cached entities (once in the second level cache, and once in my own separate cache) if I ever decide the second level cache is required elsewhere in the system.
I suspect if I can get a hold of the implementation of ICache at runtime, all I would need to do is call the Put() method to stick my data into the cache. But this might be treading on dangerous ground...
Can anyone provide any insight as to whether any of my requirements can be met by NHibernate's second level cache mechanism? Or should I just roll my own solution and forgo NHibernate's second level cache altogether?
Thanks
PS. I've already considered a cube to do the aggregation calculations much more quickly, but that still leaves me with the database as the bottleneck. I may well use a cube in addition to the cache, but the lack of a cache is my primary concern right now.
Stop using your transactional (OLTP) data source for analytical (OLAP) queries and the problem goes away.
When a domain-significant event occurs (e.g. a new entity enters the system or is updated), fire an event (a la domain events). Wire up a handler for the event that takes the details of the created or updated entity and stores the data in a denormalised reporting store specifically designed to allow reporting of the aggregates you desire (most likely push the data into a star schema). Now your reporting is simply the querying of aggregates (which may even be precalculated) along predefined axes, requiring nothing more than a simple SELECT and a few joins. Querying can be carried out using something like L2SQL or even simple parameterised queries and data readers.
Performance gains should be significant as you can optimise the read side for fast lookups across many criteria while optimising the write side for fast lookups by id and reduced index load on write.
Additional performance and scalability are also gained because, once you have migrated to this approach, you can physically separate your read and write stores, so that you can run n read stores for every write store, thereby allowing your solution to scale out to meet increased read demand while write demand increases at a lower rate.
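A rough sketch of the handler side of this idea (every type and member name here - OrderPlaced, IReportingStore, InsertFact - is invented for illustration):

// Hypothetical domain event raised when new data enters the system.
public class OrderPlaced
{
    public int OrderId { get; set; }
    public DateTime PlacedAt { get; set; }
    public decimal Total { get; set; }
}

// Handler that denormalises the event into the reporting store (e.g. a star schema).
public class OrderPlacedReportingHandler
{
    private readonly IReportingStore _store; // hypothetical write side of the reporting DB

    public OrderPlacedReportingHandler(IReportingStore store)
    {
        _store = store;
    }

    public void Handle(OrderPlaced e)
    {
        // One flat fact row per event; later aggregation is a simple SELECT + GROUP BY.
        _store.InsertFact("OrderFacts", new
        {
            e.OrderId,
            Day = e.PlacedAt.Date, // pre-bucketed axis for daily aggregates
            e.Total
        });
    }
}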
Define 2 cache regions "aggregation" and "aggregation.today" with a large expiry time. Use these for your aggregation queries for previous days and today respectively.
In DoIt(), make 1 NH query per day in the requested range using cacheable queries. Combine the query results in C#.
Prime the cache with a background process which calls DoIt() periodically with the date range that you need to be cached. The interval between runs must be shorter than the expiry time of the aggregation cache regions.
When today's data changes, clear cache region "aggregation.today". If you want to reload this cache region quickly, either do so immediately or have another more frequent background process which calls DoIt() for today.
When you have query caching enabled, NHibernate will pull the results from the cache if possible. This is based on the query and its parameter values.
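A minimal sketch of what the cacheable per-day queries and the eviction could look like (the region names come from the answer above; DataEntry and the Timestamp property are placeholders for your own entity):

// One cacheable query per day, stored in the appropriate cache region.
var day = starting.Date;
var results = session.CreateCriteria(typeof(DataEntry))
    .Add(Restrictions.Between("Timestamp", day, day.AddDays(1)))
    .SetCacheable(true)
    .SetCacheRegion(day == DateTime.Today ? "aggregation.today" : "aggregation")
    .List<DataEntry>();

// When today's data changes, drop only today's cached query results:
sessionFactory.EvictQueries("aggregation.today");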
When analyzing the NHibernate cache details, I remember reading that you should not rely on the cache being there, which seems like good advice.
Instead of trying to make your O/R mapper cover your application's needs, I think rolling your own data/cache management strategy might be more reasonable.
Also, the 7-day caching rule you talk about sounds like something business-related, which is something the O/R mapper should not know about.
In conclusion, make your app work without any caching at all, then use a profiler (or several: .NET, SQL, NHibernate Profiler) to see where the bottlenecks are, and start improving the "red" parts by adding caching or other optimizations where needed.
PS: about caching in general - in my experience one caching point is fine, two caches are in the gray zone (you should have a strong reason for the separation), and more than two is asking for trouble.
Hope it helps.

NHibernate, ActiveRecord, Transaction database locks and when Commits are flushed

This is a common question, but the explanations found so far and observed behaviour are some way apart.
We want the following NHibernate strategy in our MVC website:
A SessionScope for the request (to track changes)
An ActiveRecord.TransactionScope to wrap our inserts only (to enable rollback/commit of a batch)
Selects to be outside a Transaction (to reduce extent of locks)
Delayed Flush of inserts (so that our insert/updates occur as a UoW at end of session)
Now currently we:
Don't get the implied transaction from the SessionScope (with FlushAction Auto or Never)
If we use ActiveRecord.TransactionScope there is no delayed flush and any contained selects are also caught up in a long-running transaction.
I'm wondering if it's because we have an old version of NHibernate (it was from trunk, very near 2.0).
We just can't get the expected NHibernate behaviour, and performance sucks (using NHProf and SQL Profiler to monitor DB locks).
Here's what we have tried since:
Written our own TransactionScope (inherits from ITransactionScope) that:
Opens an ActiveRecord.TransactionScope on Commit, not in the ctor (delays the transaction until needed)
Opens a 'SessionScope' in the ctor if none are available (as a guard)
Converted our ids to Guid from identity
This stopped the auto flush of insert/update outside of the Transaction (!)
Now we have the following application behaviour:
Request from MVC
SELECTs needed by services are fired, all outside a transaction
Repository.Add calls do not hit the db until scope.Commit is called in our Controllers
All INSERTs / UPDATEs occur wrapped inside a transaction as an atomic unit, with no SELECTs contained.
... But for some reason NHProf now disagrees with SQL Profiler (SELECTs seem to happen in the DB before NHProf reports them).
NOTE
Before I get flamed I realise the issues here, and know that the SELECTs aren't in the Transaction. That's the design. Some of our operations will contain the SELECTs (we now have a couple of our own TransactionScope implementations) in serialised transactions. The vast majority of our code does not need up-to-the-minute live data, and we have serialised workloads with individual operators.
ALSO
If anyone knows how to get an identity column (non-PK) refreshed post-insert without needing to manually refresh the entity, and in particular by using ActiveRecord markup (I think it's possible in NHibernate mapping files using a 'generated' attribute), please let me know!
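For reference, the mapping-file feature being referred to looks roughly like this (a sketch only; LegacyNumber is a placeholder property name). It tells NHibernate to re-read the column from the database after INSERT instead of requiring a manual refresh:

<!-- DB-generated, non-PK column re-read after insert -->
<property name="LegacyNumber"
          generated="insert"
          insert="false"
          update="false" />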