Scenario: I have built an ASP.NET MVC application that manages my cooking recipes. I am using FluentNHibernate to access data from the following tables:
Users
Categories
Recipes
RecipeCategories (many-to-many junction table)
UserBookmarkedRecipes (many-to-many junction table)
UserCookedRecipes (many-to-many junction table)
Question: Is there any way to tell NHibernate to load all data from all tables listed above and store it in memory / in NHibernate's cache so that there do not have to be any additional database requests?
Motivation behind question: The number of many-to-many relationships makes querying awkward and expensive, so the application would greatly benefit from this optimization.
Note regarding data: The overall amount of data is extremely small. We are talking about less than 100 recipes at the moment.
Instead of preloading everything, I would suggest loading once and then holding it as it's accessed.
NHibernate maintains two different caches, and does a pretty good job of keeping them in sync with your underlying data store. By default, it uses what is called a "first level" cache on a per-session basis, but I don't think that's what you want. You can read about the differences on the NHibernate FAQ page on caching.
I suspect a second level cache is what you need (this is available throughout your app). You'll need to get a cache provider from NHContrib (download the version that matches your version of NHibernate). The SysCache2 provider will probably be the easiest to set up for your scenario, as long as your app will be the ONLY thing writing to the database. If other processes will be writing, you will need to ensure that all are using the same cache as an intermediary if you want it to stay in sync.
The second-level cache is configured with a timeout that you can set to whatever you need. I don't think it can be infinite, but you can set it to long periods if you want (it's probably not a terrible idea to go back to the DB from time to time anyway). If you want to preload everything up front, you can simply access all your entities from your Global.asax's Application_Start method, but this shouldn't be necessary.
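If you do decide to warm the cache up front, a minimal sketch might look like this (SessionFactoryHolder is a hypothetical holder for your built session factory, and the entities must be mapped as cacheable as described below):

protected void Application_Start()
{
    // Hypothetical warm-up: loading each entity type once pushes the instances
    // into the second-level cache, provided their mappings are marked cacheable.
    using (var session = SessionFactoryHolder.Factory.OpenSession())
    {
        session.CreateCriteria(typeof(User)).List();
        session.CreateCriteria(typeof(Category)).List();
        session.CreateCriteria(typeof(Recipe)).List();
    }

    // ... the usual MVC registration (routes, areas, etc.) goes here.
}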
You will need to configure your session factory to use the cache. Call the .Cache(...) method when fluently configuring your session factory; it should be relatively self-explanatory.
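For example, a minimal sketch of the fluent configuration (the connection string key and the exact provider class are assumptions; use the provider type from whichever NHibernate.Caches package you installed):

var sessionFactory = Fluently.Configure()
    .Database(MsSqlConfiguration.MsSql2008
        .ConnectionString(c => c.FromConnectionStringWithKey("Recipes"))) // assumed key name
    .Mappings(m => m.FluentMappings.AddFromAssemblyOf<RecipeMap>())
    .Cache(c => c
        .UseSecondLevelCache()
        .UseQueryCache()
        .ProviderClass<NHibernate.Caches.SysCache2.SysCacheProvider>()) // from the NHContrib SysCache2 package
    .BuildSessionFactory();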
You will also need to set Cache.ReadWrite() in both your entity mappings AND your relationship mappings. You can do this by convention or by calling Cache.ReadWrite() in your fluent mappings.
Something like:
public class RecipeMap : ClassMap<Recipe>
{
    public RecipeMap()
    {
        Cache.ReadWrite();
        Id(x => x.Id);
        HasManyToMany(x => x.Ingredients).Cache.ReadWrite();
    }
}
On the Cache calls in your mappings you can specify ReadOnly or whatever else you may need. NonStrictReadWrite is an interesting one; it can boost performance significantly, but at an increased risk of reading stale data from the cache.
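If you prefer the convention route mentioned earlier, a sketch of a FluentNHibernate convention that applies Cache.ReadWrite() to every mapped class might look like this:

using FluentNHibernate.Conventions;
using FluentNHibernate.Conventions.Instances;

public class CacheEverythingConvention : IClassConvention
{
    // Marks every mapped class as read-write cacheable, so individual
    // ClassMaps don't each need their own Cache.ReadWrite() call.
    public void Apply(IClassInstance instance)
    {
        instance.Cache.ReadWrite();
    }
}

Register it when configuring the mappings, for example with m.FluentMappings.AddFromAssemblyOf<RecipeMap>().Conventions.Add<CacheEverythingConvention>().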
Related
I'm working with a standard ASP.NET Core 2.1 application, and I've been considering the problem that a great many of my controller methods require the currently logged-on user to be retrieved.
I noticed that the ASP.NET Core Identity code uses a DbSet to hold the entities, so subsequent calls to it should read from the local entities in memory rather than hitting the database. However, it appears that every call in my code results in a database read (I know because I'm running SQL Profiler and can see the SELECT queries against AspNetUsers being run with Id as the key).
I know there are many ways to set Identity up, and it has changed over the versions, so maybe I'm not doing something right, or maybe there is a fundamental problem here that could be addressed.
I set up the default EF and Identity stores in Startup.cs's ConfigureServices:
services.AddDbContext<MyDBContext>(options => options.UseSqlServer(Configuration.GetConnectionString("MyDBContext")));
services.AddIdentity<CustomIdentity, Models.Role>().AddDefaultTokenProviders().AddEntityFrameworkStores<MyDBContext>();
and read the user in each controller method:
var user = await _userManager.GetUserAsync(HttpContext.User);
In the Identity code, it seems that this method calls the UserStore's FindByIdAsync method, which calls FindAsync on the DbSet of users.
The EF performance paper says:
It’s important to note that two different ObjectContext instances will have two different ObjectStateManager instances, meaning that they have separate object caches.
So what could be going wrong here? Any suggestions as to why ASP.NET Core's EF calls within UserStore are not using the local DbSet of entities? Or am I thinking about this wrongly, and each time a call is made to a controller a new EF context is created?
Any suggestions as to why ASP.NET Core's EF calls within UserStore are not using the local DbSet of entities?
Actually, FindAsync does do that. Quoting MSDN...
Asynchronously finds an entity with the given primary key values. If an entity with the given primary key values exists in the context, then it is returned immediately without making a request to the store. Otherwise, a request is made to the store for an entity with the given primary key values and this entity, if found, is attached to the context and returned. If no entity is found in the context or the store, then null is returned.
So you can't avoid the initial read per request for the object, but subsequent reads in the same request won't query the store. That's the best you can do outside crazy levels of micro-optimization.
Yes. Controllers are instantiated and destroyed with each request, regardless of whether it's the same or a different user making the request. Further, the context is request-scoped, so it too is instantiated and destroyed with each request. If you query the same user multiple times during the same request, it will attempt to use the entity cache for subsequent queries, but you're likely not doing that.
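To illustrate (a sketch, not a recommendation): within a single request, only the first lookup by key goes to the database, because later lookups are satisfied from the scoped context's tracked entities.

// Both calls run inside the same request, i.e. against the same scoped DbContext.
var first  = await _userManager.GetUserAsync(HttpContext.User); // SELECT against AspNetUsers
var second = await _userManager.GetUserAsync(HttpContext.User); // served from the tracked entities, no SQL

// In a different request, a new DbContext is created, so the SELECT runs again.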
That said, this is a textbook example of premature optimization. Querying a user from the database is an extremely quick query. It's just a simple select statement on a primary key; it doesn't get any quicker or simpler as far as database queries go. You might be able to save a few milliseconds if you utilize memory caching, but that comes with a whole set of considerations, particularly being careful to segregate the cache by user so that you don't accidentally bring in the wrong data for the wrong user.
More to the point, memory cache is problematic for a host of reasons, so it's more typical to use distributed caching in production. Once you go there, caching doesn't really buy you anything for a simple query like this, because you're merely fetching it from the distributed cache store (which could even be a database like SQL Server) instead of your database. It only makes sense to cache complex and/or slow queries, as it's only then that retrieving from cache actually ends up being quicker than just hitting the database again.
Long and short, don't be afraid to query the database. That's what it's there for. It's your source for data, so if you need the data, make the query. Once you have your site going, you can profile or otherwise monitor the performance, and if you notice slow or excessive queries, then you can start looking at ways to optimize. Don't worry about it until it's actually a problem.
When doing a criteria query with NHibernate, I want to get fresh results and not old ones from a cache.
The process is basically:
Load persistent objects into the NHibernate application via a query.
Change database entries externally (another program, manual edit in SSMS / MSSQL etc.).
Query the persistent objects again (with the same query code); previously loaded objects should be refreshed from the database.
Here's the code (slightly changed object names):
public IOrder GetOrderByOrderId(int orderId)
{
    ...
    IList result;
    var query =
        session.CreateCriteria(typeof(Order))
            .SetFetchMode("Products", FetchMode.Eager)
            .SetFetchMode("Customer", FetchMode.Eager)
            .SetFetchMode("OrderItems", FetchMode.Eager)
            .Add(Restrictions.Eq("OrderId", orderId));
    query.SetCacheMode(CacheMode.Ignore);
    query.SetCacheable(false);
    result = query.List();
    ...
}
The SetCacheMode and SetCacheable calls were added by me to disable the cache. Also, the NHibernate factory is set up with the config parameter UseQueryCache=false:
Cfg.SetProperty(NHibernate.Cfg.Environment.UseQueryCache, "false");
No matter what I do, including Put/Refresh cache modes for the query or the session, NHibernate keeps returning outdated objects the second time the query is called, without the externally committed changes. (For the record, the outdated value in this case is the value of a Version column, used to test whether a stale object state can be detected before saving.) But I need fresh query results for multiple reasons!
NHibernate even generates an SQL query, but it is never used for the values returned.
Keeping the sessions open is necessary to do dynamic updates on dirty columns only (and stateless sessions are not a solution either). I don't want to add Clear(), Evict() or the like everywhere in the code, especially since the query sits at a lower level and doesn't know about the objects previously loaded. Pessimistic locking would kill performance (multi-user environment!).
Is there any way to force NHibernate, by configuration, to send queries directly to the DB and get fresh results, not using unwanted caching functions?
First of all: this doesn't have anything to do with second-level caching (which is what SetCacheMode and SetCacheable control). Even if it did, those control caching of the query, not caching of the returned entities.
When an object has already been loaded into the current session (also called "first-level cache" by some people, although it's not a cache but an Identity Map), querying it again from the DB using any method will never override its value.
This is by design and there are good reasons for it behaving this way.
If you need to update potentially changed values in multiple records with a query, you will have to Evict them from the session first.
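A minimal sketch of that approach (entity and property names taken from the question):

// Evict the previously loaded instance so the next query materializes
// fresh state from the database instead of reusing the session's copy.
var stale = session.Get<Order>(orderId);
if (stale != null)
    session.Evict(stale);

var fresh = session.CreateCriteria(typeof(Order))
    .Add(Restrictions.Eq("OrderId", orderId))
    .UniqueResult<Order>();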
Alternatively, you might want to read about Stateless Sessions.
Is this code running in a transaction? Or is that external process running in a transaction? If one of those two is still in a transaction, you will not see any updates.
If that is not the case, you might be able to find the problem in the log messages that NHibernate is creating. These are very informative and will always tell you exactly what it is doing.
Keeping the sessions open is necessary to do dynamic updates on dirty columns only
This is either the problem now or it will become a problem in the future. NHibernate is doing all it can to make your life better, but you are trying to do as much as possible to prevent NHibernate from doing its job properly.
If you want NHibernate to update the dirty columns only, you could look at the dynamic-update attribute in your class mapping file.
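For example, in an hbm.xml mapping (a sketch; the class and table names are taken from the question):

<class name="Order" table="Orders" dynamic-update="true">
  <!-- column mappings as before; with dynamic-update enabled, NHibernate
       generates UPDATE statements containing only the columns that changed -->
</class>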
At the company I work for, we have a single database schema, but each of our clients uses their own dedicated database, with one central database that stores client contact details and which database each client is using, so we can connect to the appropriate database. I've looked at using NHibernate Shards, but it seems to have gone very quiet and doesn't look complete.
Does anyone know the status of this project? Has anyone used it in production?
If it's not yet at a point that is considered usable in production, what are the alternatives? The two main ones seem to be:
Create a session factory per database and then a wrapper to select the appropriate factory to generate the correct session - this seems to me to involve redundant session factories and not be very efficient
Create just one session factory but, when calling OpenSession, pass it an IDbConnection - which would allow the session to have a different database connection.
My concern with option 2 is how NHibernate will cope with the second-level cache, as I believe it is controlled by the session factory; the HiLo generator also uses the session factory, I believe. In these cases, will having sessions attached to different databases cause problems? For example, we will end up with a MyCompany.Model.User that has an id of 2 in both databases; will this cause conflicts within the cache?
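To make the first alternative concrete, here is a minimal sketch of a factory-per-database wrapper (all names are illustrative, and FluentNHibernate is assumed only for brevity):

using System.Collections.Concurrent;
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate;

// Builds and caches one ISessionFactory per client database connection string.
// Which connection string a client uses would come from the central database.
public class TenantSessionSource
{
    private readonly ConcurrentDictionary<string, ISessionFactory> factories =
        new ConcurrentDictionary<string, ISessionFactory>();

    public ISession OpenSessionFor(string clientConnectionString)
    {
        var factory = factories.GetOrAdd(clientConnectionString, cs =>
            Fluently.Configure()
                .Database(MsSqlConfiguration.MsSql2008.ConnectionString(cs))
                .Mappings(m => m.FluentMappings.AddFromAssemblyOf<User>())
                .BuildSessionFactory());

        return factory.OpenSession();
    }
}

Because each factory is tied to one database, each also has its own second-level cache and HiLo state, so a User with id 2 in two databases never shares a cache entry; the trade-off is the memory and startup cost of one factory per client.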
You could have a look at Enzo SQL Shard, a sharding library for SQL Server. If you are already using NHibernate, there might be a few changes required in the code, though.
NHibernate Shards is up to date with the latest NHibernate API changes and now supports all query models of NHibernate, including LINQ. Complex scalar queries are currently not supported.
We use it in production for a multi-tenant environment, but there are a few things to be mindful of.
NHibernate Shards has a session factory for each shard, but only uses a single NHibernate Configuration instance to generate the session factories. This approach likely won't scale well to large numbers of shards.
Querying across shards does not play well with paging. It works, but can involve considerable client-side processing. It is best to keep result sets as small as possible and lock queries to single shards where feasible.
I've got a web application that is 99% read-only, with a separate service that updates the database at specific intervals (like every 10 minutes). How can this service tell the application to invalidate its second-level cache? Is it actually important? (I don't really mind having some stale data.) If I don't invalidate the cache, how long will it take for the records to be updated (if using SysCache)?
You can manually evict entries from your second-level cache for a specific entity instance, an entity type, or a collection.
From http://knol.google.com/k/fabio-maulo/nhibernate-chapter-16-improving/1nr4enxv3dpeq/19#
For the second-level cache, there are methods defined on ISessionFactory for evicting the cached state of an instance, entire class, collection instance or entire collection role.
sessionFactory.Evict(typeof(Cat), catId); //evict a particular Cat
sessionFactory.Evict(typeof(Cat)); //evict all Cats
sessionFactory.EvictCollection("Eg.Cat.Kittens", catId); //evict a particular collection of kittens
sessionFactory.EvictCollection("Eg.Cat.Kittens"); //evict all kitten collections
If you are OK with the possibility of having some stale data, just set the default expiration to something you are comfortable with, and you'll be set.
Example:
<property name="cache.default_expiration">120</property>
This sets the default expiration to two minutes, so you'll never see stale data older than that.
What are your experiences with the latest version of NHibernate (2.0.1 GA) regarding disconnected scenarios?
A disconnected scenario is where I fetch some object graph from NHibernate, disconnect from the session (and database connection), do some changes in the object graph (deleting in collections, adding entities, updating entities) and then reconnect and save....
We tried this in a client-server architecture. Now we are moving to DTOs (data transfer objects). This means the detached entities are no longer sent directly to the client; specialized objects are sent instead.
The main reason to move in this direction is not NHibernate; it is actually the serialization needed to send entities to the client. While you can use lazy loading (and you will!) while attached to the session, you need to fetch all references from the database in order to serialize the object graph.
We had lots of Guids instead of references and lots of properties which are mapped but not serialized ... and it became a pain. So it's much easier to copy the stuff you really want to serialize into its own structure.
Apart from that, working detached can work well.
Be careful with lazy loading, which will cause exceptions to be thrown when accessing non-loaded objects on a detached instance.
Be careful with concurrency; the chance that entities have changed while they were detached is high.
Be careful if you need some sort of security, or if you want only your server to be allowed to make certain data changes. The detached objects could potentially come back in any state.
You may want to take a look at the session methods SaveOrUpdateCopy and Merge.
Here is an article which gives you more details:
NHibernate feature: SaveOrUpdateCopy & Merge
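A minimal sketch of reattaching a detached entity with Merge (SaveOrUpdateCopy is the older equivalent on NHibernate 2.0.x; the names here are illustrative):

// 'detachedOrder' was loaded in an earlier session, modified while disconnected,
// and is now being reattached and persisted in a new session.
using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
    // Merge copies the detached state onto the persistent instance (loading it if needed)
    // and returns the managed copy; the detached instance itself stays detached.
    var managed = (Order)session.Merge(detachedOrder);
    tx.Commit();
}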