NHibernate Second Level Cache with database change notification on desktop App - nhibernate

I am developing a WPF application using NHibernate to communicate with a PostgreSQL Database.
The only caching provider that works on a desktop app is Bamboo Prevalence (correct me if I am wrong). Given that every computer running my application will have different Session Factory, my application retrieves stale data from the cache.
My question is, how can I tell NHibernate/Prevalence to look at the timestamp of when the data was last updated, and if the cache is stale, refresh it?

Well, I found out that there is no way the Second Level cache can know if the database was changed outside Nhibernate/Cache, so what I did was creating a new column 'Timestamp' on all my tables.
On my queries, I first select the timestamp of the db using Session.Cachemode(CacheMode.Ignore) to get the timestamp of the db and I compare with the result from the cache. In the case the timestamps differ, I invalidate the cache for that query and run it again.
About the SysCache, even knowing it 'can work' on a WPF desktop app, I was not keen to use System.Web.Cache as my application would need the the complete .Net Framework instead of the Client Profile. I did a search and for my happiness someone wrote a Nhiberate cache proviver that implements (System.Runtime.Caching), witch is not a ASP.Net component. If anyone is interested you can find the source at:
https://github.com/Leftyx/nhcontrib/tree/master/src/NHibernate.Caches/MemoryCache

Well that is a property that you could set at the cache level and expire items according to your applications needs and then have the cache. Ncache is a possible L2 cache provider for NHibernate. NCache ensures that its cache is consistent across multiple servers and all cache updates are synchronized correctly so no data integrity issues arise. To learn more please visit:
http://www.alachisoft.com/ncache/nhibernate-l2cache-index.html

Related

ASP.NET Core - caching users from Identity

I'm working with a standard ASP.NET Core 2.1 program, and I've been considering the problem that a great many of my controller methods require the current logged on user to be retrieved.
I noticed that the asp.net core Identity code uses a DBSet to hold the entities and that subsequent calls to it should be reading from the local entities in memory and not hitting the DB, but it appears that every time, my code requires a DB read (I know as I'm running SQL Profiler and see the select queries against AspNetUsers being run using Id as the key)
I know there's so many ways to set Identity up, and its changed over the versions that maybe I'm not doing something right, or is there a fundamental problem here that could be addressed.
I set up the default EF and Identity stores in startup.cs's ConfigureServices:
services.AddDbContext<MyDBContext>(options => options.UseSqlServer(Configuration.GetConnectionString("MyDBContext")));
services.AddIdentity<CustomIdentity, Models.Role>().AddDefaultTokenProviders().AddEntityFrameworkStores<MyDBContext>();
and read the user in each controller method:
var user = await _userManager.GetUserAsync(HttpContext.User);
in the Identity code, it seems that this method calls the UserStore FindByIdAsync method that calls FindAsync on the DBSet of users.
the EF performance paper says:
It’s important to note that two different ObjectContext instances will have two different ObjectStateManager instances, meaning that they have separate object caches.
So what could be going wrong here, any suggestions why ASP.NET Core's EF calls within Userstore are not using the local DBSet of entities? Or am I thinking this wrongly - and each time a call is made to a controller, a new EF context is created?
any suggestions why ASP.NET Core's EF calls within Userstore are not using the local DBSet of entities?
Actually, FindAsync does do that. Quoting msdn (emphasis mine)...
Asynchronously finds an entity with the given primary key values. If an entity with the given primary key values exists in the context, then it is returned immediately without making a request to the store. Otherwise, a request is made to the store for an entity with the given primary key values and this entity, if found, is attached to the context and returned. If no entity is found in the context or the store, then null is returned.
So you can't avoid the initial read per request for the object. But subsequent reads in the same request won't query the store. That's the best you can do outside crazy levels of micro-optimization
Yes. Controller's are instantiated and destroyed with each request, regardless of whether it's the same or a different user making the request. Further, the context is request-scoped, so it too is instantiated and destroyed with each request. If you query the same user multiple times during the same request, it will attempt to use the entity cache for subsequent queries, but you're likely not doing that.
That said, this is a text-book example of premature optimization. Querying a user from the database is an extremely quick query. It's just a simple select statement on a primary key - it doesn't get any more quick or simple as far as database queries go. You might be able to save a few milliseconds if you utilize memory caching, but that comes with a whole set of considerations, particularly being careful to segregate the cache by user, so that you don't accidentally bring in the wrong data for the wrong user. More to the point, memory cache is problematic for a host of reasons, so it's more typical to use distributed caching in production. Once you go there, caching doesn't really buy you anything for a simple query like this, because you're merely fetching it from the distributed cache store (which could even be a database like SQL Server) instead of your database. It only makes sense to cache complex and/or slow queries, as it's only then that retrieving it from cache actually ends up being quicker than just hitting the database again.
Long and short, don't be afraid to query the database. That's what it's there for. It's your source for data, so if you need the data, make the query. Once you have your site going, you can profile or otherwise monitor the performance, and if you notice slow or excessive queries, then you can start looking at ways to optimize. Don't worry about it until it's actually a problem.

Refresh redis cache on DB change

I've got a stored procedure that loads some data (about 59k items) and it takes 30 seconds. This SP must be called when the application starts. I was wondering if there's a reasonable way to invalidate the Redis cache entry via SQL ...any suggestion?
Thanks
Don't do it from your SQL, do the invalidation / (re)loading to Redis from your application.
The loading of this data into your application should be done by a separate component/service/module/part of your application. So that part should have all the responsibility of handling the needed data, including (re)loading it into the app, invalidating and reloading into Redis and so on. You should see your Redis server as an extension of your application cached data and not of your sql server data. That's why you should not link your relational database to your Redis. If you are going to change how you save this data into Redis that should not affect the SQL part, but only the application, and actually only the part of your application specialized with this.

How can we setup DB and ORM for the absence of Data Consistency requierement?

Imagine we have a web-site which sends write and read requests into some DB via Hibernate. I use Java, but it doesn't matter for this question.
Usually we want to read the fresh data from DB. But I want to introduce some delay between the written data becomes visible to reads just to increase the performance. I.e. I dont need to "publish" the rows inserted into DB immediately. Its OK for me to "publish" fresh data after some delay.
How can I achieve it?
As far as I understand this can be set up on several different tiers of my system.
I can cache some requests in front-end. Probably I should set up proxy server for this. But this will work only if all the parameters of the query match.
I can cache the read requests in Hibernate. OK, but can I specify or estimate the average time the read query will return stale data after some fresh insert occurred? In other words how can I control the delay time between fresh data becomes visible to the users?
Or may be I should use something like a memcached system instead of Hibernate cache?
Probably I can set something in DB. I dont know what should I do with DB. Probably I can ease the isolation level to burst the performance of my DB.
So, which way is the best one?
And the main question, of course: does the relaxation of requirements I introduce here may REALLY help to increase the performance of my system?
If I am reading your architecture correct you have client -> server -> database server
Answers to each point
This will put the burden on the client to implement the caching if you only use your own client I would go for this method. It will have the side effect of improving client performance possibly and put less load on the server and database server so they will scale better.
Now caching on the server will improve scalability of the database server and possibly performance in the client but will put a memory burden on the server. This would be my second option
Implement something in the database. At this point what are you gaining? the database server still has to do work to determine what rows to send back. And also you will get no scalability benefits.
So to sum up I would cache at the client first if you can if not cache at the server. Leave the DB out of the loop.
To answer your main question - caching is one of the most effective ways of increasing both performance and scalability of web applications which are constrained by database performance - your application may or may not fall into this category.
In general, I'd recommend setting up a load testing rig, and measure the various parts of your app to identify the bottleneck before starting to optimize.
The most effective cache is one outside your system - a CDN or the user's browser. Read up on browser caching, and see if there's anything you can cache locally. Browsers have caching built in as a standard feature - you control them via HTTP headers. These caches are very effective, because they stop requests even reaching your infrastructure; they are very efficient for static web assets like images, javascript files or stylesheets. I'd consider a proxy server to be in the same category. The major drawback is that it's hard to manage this cache - once you've said to the browser "cache this for 2 weeks", refreshing it is hard.
The next most effective caching layer is to cache (parts of) web pages on your application server. If you can do this, you avoid both the cost of rendering the page, and the cost of retrieving data from the database. Different web frameworks have different solutions for this.
Next, you can cache at the ORM level. Hibernate has a pretty robust implementation, and it provides a lot of granularity in your cache strategies. This article shows a sample implementation, including how to control the expiration time. You get a lot of control over caching here - you can specify the behaviour at the table level, so you can cache "lookup" data for days, and "transaction" data for seconds.
The database already implements a cache "under the hood" - it will load frequently used data into memory, for instance. In some applications, you can further improve the database performance by "de-normalizing" complex data - so the import routine might turn a complex data structure into a simple one. This does trade of data consistency and maintainability against performance.

Google App Engine automatically updating memcache

So here's the problem, I've created a database model. When I create the model, a = Model(args), and then perform a.put(), GAE seems to automatically update the memcache, because all the data seems up-to-date even without me hitting the database. Logging the number of elements in the cache works also shows the correct number of elements. But I'm not manually updating the cache. How do I prevent this? Cheers.
You can set policy functions:
Automatic caching is convenient for most applications but maybe your application is unusual and you want to turn off automatic caching for some or all entities. You can control the behavior of the caches by setting policy functions.
Memcache Policy
That's for NDB. You don't say what language/DB you are using but I'm sure it's all similar.

How to refresh NHibernate cached data on WPF?

I have this WPF application using NHibernate and lazy data loading. I also use Microsoft Sync framework to sync data to and from a central database server. So what happens is that when I modify data on the central database server and sync it with the WPF client app, I can't get the latest data to be displayed to the UI since NHibernate has cached it already. So I need to restart the WPF application to be able to display the latest synched data.
I need a solution to refresh NHibernate data on the WPF app. How can I do this?
The first level cache is only active within a given session, so if you use a new session each time you attempt to retrieve the object you should get the latest results (given you don't have the second level cache configured).