One database, two applications, 2nd-level caching and NHibernate - nhibernate

What do I need to know when setting up caching with NHibernate if I have two applications running on different servers but only one database? Are table dependencies generally sufficient to make sure that weird caching problems don't arise? If so, what sort of poll time should I look at?

For NHibernate to detect concurrency issues, you can add a version (or timestamp) field to your entities. NHibernate will then throw a concurrency exception when you try to update an entity that has been modified by someone else in the meantime.
If you want to use the second-level cache with multiple servers, I can recommend a distributed implementation of the NHibernate second-level cache, for example NCache:
http://www.alachisoft.com/ncache/nhibernate_index.html
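To illustrate the version-field idea, here is a minimal sketch in Python rather than NHibernate code. The Table class, StaleObjectError, and the in-memory rows are all hypothetical stand-ins; the point is only that an update succeeds when the version the caller read is still the current one.

```python
class StaleObjectError(Exception):
    """Raised when a row was changed by someone else since it was read."""


class Table:
    """In-memory stand-in for a database table with a version column."""

    def __init__(self):
        self.rows = {}  # id -> (version, data)

    def insert(self, row_id, data):
        self.rows[row_id] = (1, data)

    def read(self, row_id):
        version, data = self.rows[row_id]
        return version, data

    def update(self, row_id, expected_version, data):
        # Equivalent in spirit to:
        #   UPDATE t SET data = ?, version = version + 1
        #   WHERE id = ? AND version = ?
        current_version, _ = self.rows[row_id]
        if current_version != expected_version:
            raise StaleObjectError(
                f"row {row_id} is at version {current_version}, "
                f"expected {expected_version}"
            )
        self.rows[row_id] = (current_version + 1, data)
        return current_version + 1


table = Table()
table.insert(1, "original")

# Two applications read the same row while it is at version 1.
version_a, _ = table.read(1)
version_b, _ = table.read(1)

# App A updates first; the row moves to version 2.
table.update(1, version_a, "changed by app A")

# App B still holds version 1, so its update is rejected.
try:
    table.update(1, version_b, "changed by app B")
    conflict_detected = False
except StaleObjectError:
    conflict_detected = True
```

Note that this only protects writes. A stale read from a local cache on the other server is a separate problem, which is where the distributed cache comes in.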

Related

Consistency/Atomicity (or even ACID) properties in multiple SQL/NoSQL databases architecture

I'm used to working with a single database (say PostgreSQL or ElasticSearch).
But currently I'm using a mix (PG and ES) in a prototype app and may add other kinds of databases to the mix (e.g. Redis).
Say some piece of data needs to be persisted to each database in a different way.
How do you keep a system consistent in the event of a failure in one of the components/databases?
Example scenario that I'm facing:
Data update on PostgreSQL, ElasticSearch is unavailable.
At this point, the system is inconsistent, as I should have updated both databases.
As I'm using an SQL db, I can simply abort the transaction to put the system in its previous consistent state.
But what is the best way to keep the system consistent?
Check every time that the value has been persisted in all databases?
In case of failure, restore the previous state? But some NoSQL databases have no transaction/ACID mechanism, so I can't revert to the previous state as easily.
Additionally, if multiple databases must be kept in sync, is there a good practice to follow, such as adding some kind of "version" metadata (a timestamp or a home-made incrementing version number) so you can put your databases back in sync? (Not talking about CouchDB, where it is built in!)
Moreover, the databases are not all updated atomically, so some parts are inconsistent for a short period. I think it depends on the business of the app, but does anyone have thoughts about the problems that may occur or how to fix them? I guess it must be tough and depends a lot on the configuration (for maybe very few real benefits).
I guess this may be a common architecture issue, but I'm having trouble finding information on the subject.
Keep things simple.
A search engine can and will lag behind sometimes. You may fight it or you may embrace it. It's fine, and most of the time it's acceptable.
Don't mix the data. If you use Redis for sessions - good. Don't store stuff from database A in B and vice versa.
Select proper database with ACID and strong consistency for your Super Important Business Data™®.
Again, do not mix the data.
Using more than one database technology in one product is a decision one shouldn't make lightly. The more technologies you use, the more complex your project becomes in development, deployment, maintenance and administration. Also, every database technology becomes its own point of failure. That means it is often much wiser to stick to one technology, even when it means that you need to make some compromises.
But when you have a good(!) reason to use multiple DBMS, you should try to keep them as separate as possible. Avoid placing related data across multiple databases. When possible, no feature should require more than one DBMS to work (preferably, a failure of one DBMS would only affect those features which use it). Storing redundant data in two different DBMS should also be avoided.
When you can't avoid redundancies and relationships spanning multiple DBMS, you should decide on one system to be the single source of truth (preferably one which you trust most regarding consistency). When there are inconsistencies between systems, they should be resolved by synchronizing the data with the SSOT.
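As a rough sketch of what synchronizing with the SSOT could look like, combined with the "version" metadata idea from the question: the function below overwrites or removes anything in a secondary store that disagrees with the source of truth. The resync function, the dict-based stores, and the sample keys are all made up for illustration; a real PostgreSQL-to-ElasticSearch sync would go through their client libraries.

```python
def resync(source_of_truth, replica):
    """Make `replica` match `source_of_truth`; return the keys that were repaired.

    Both stores are dicts mapping key -> (version, value).
    """
    repaired = []

    # Copy over anything that is missing or at a different version.
    for key, (version, value) in source_of_truth.items():
        if replica.get(key, (None, None))[0] != version:
            replica[key] = (version, value)
            repaired.append(key)

    # Remove entries that no longer exist in the source of truth.
    for key in list(replica):
        if key not in source_of_truth:
            del replica[key]
            repaired.append(key)

    return sorted(repaired)


# Hypothetical state after a failed update: the search index is behind.
primary = {"doc1": (3, "title v3"), "doc2": (1, "other")}
search_index = {"doc1": (2, "title v2"), "doc9": (1, "deleted upstream")}

repaired = resync(primary, search_index)
```

The replica never wins a conflict here, which is the whole point of picking a single source of truth. Running the same function again is a no-op, so it is safe to run periodically or after a failure.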

What do EclipseLink's internal optimizations via weaving include?

I am new to EclipseLink and am getting to know it step by step. Right now I am working on performance optimizations via weaving in order to use lazy loading for ***ToOne relationships, fetch groups for partial loading of entity instances, change tracking for commit performance optimizations and internal optimizations for ... And here is the question. Unfortunately, I haven't found the right information about this via googling.
Could somebody explain what kind of internal optimizations EclipseLink performs via this weaving setting?
Thanks in advance,
Simeon
I'd recommend you break up your question to make it more specific on what exactly you are looking for, but I'll try to add information.
Weaving allows EclipseLink to change the bytecode of your entities to add provider-specific methods etc., so that you do not need to introduce a dependency within your model. The terms listed in the doc you found (lazy loading, fetch groups, etc.) are all performance enhancements that you would need to look up individually. All can be used without weaving, but would require changes to your entity to implement EclipseLink interfaces and methods.
Lazy loading delays fetching a relationship until your application accesses it. getEmployee() in your entity, for instance, will just return the referenced employee attribute. Without weaving, the employee must have been fetched already or a null will be returned incorrectly. With weaving, code can be added to the entity so that it goes to the database to fetch it on demand.
Fetch groups are a similar concept that applies to basic mappings instead of relationships, while change tracking is more advanced and allows EclipseLink to be notified when you make a change to the entity rather than having to compare changes with a prebuilt backup on commit. Each will have independent references within the EclipseLink documentation.
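To make the lazy loading part concrete, here is a hand-written sketch of the fetch-on-first-access behaviour. It is not how EclipseLink implements it (weaving injects the equivalent logic into the entity's bytecode), and LazyReference, fetch_employee and Project are hypothetical names used only for this example.

```python
class LazyReference:
    """Holds a loader and runs it only on first access."""

    _UNSET = object()

    def __init__(self, loader):
        self._loader = loader
        self._value = LazyReference._UNSET

    def get(self):
        if self._value is LazyReference._UNSET:
            self._value = self._loader()
        return self._value


fetch_log = []  # records every simulated database hit


def fetch_employee(employee_id):
    # Hypothetical stand-in for a database query.
    fetch_log.append(employee_id)
    return {"id": employee_id, "name": "Example Employee"}


class Project:
    def __init__(self, employee_id):
        # Nothing is fetched here; only the way to fetch is remembered.
        self._employee = LazyReference(lambda: fetch_employee(employee_id))

    def get_employee(self):
        return self._employee.get()


project = Project(employee_id=7)
fetches_before_access = len(fetch_log)

first = project.get_employee()   # triggers the one and only fetch
second = project.get_employee()  # served from the already loaded value
```

Without weaving you would have to write this kind of indirection into the entity yourself; with weaving the getter can stay a plain getter in your source.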

Is there any reason I shouldn't cache in nHibernate?

I've just discovered the joy of Cache.ReadWrite() in fluent nHibernate, and have been analyzing the results with nhprof extensively.
It seems to be quite useful, but that seems a bit deceptive. Is there any particular reason I wouldn't want to cache a very frequently used object from a query? I mean, I have to presume I should not just go around decorating every single Mapping with a Cache property ... or should I?
As usual, it depends :)
If something has potential to be updated by background processes that don't use the second level cache, or changed directly in the database, caching will cause problems.
Entities that are infrequently accessed may not be good candidates for second level caching either, as they will just take up space.
Also, you may see some weirdness if you have collections mapped as Inverse - the changes will not be picked up by the second level cache correctly and you'll need to manually evict the collection.
As sJhonny points out below, if you have a web farm scenario (or any setup where your app is running on several servers) you'll need to use a distributed cache (like memcached) instead of the built-in ASP.NET cache.

JBoss TreeCache vs PojoCache when using invalidation rather than replication

We are setting up a JBoss cluster and are building our own distributed cache solution on top of JBoss Cache (we can't use it as a 2nd-level cache for the ORM layer in our case). We want to use invalidation and not replication as the cache mode. As far as I can see after (very) little testing, both solutions seem to work: objects are put into the cache, and objects seem to be evicted when they are updated on any of the servers.
This leads me to believe that PojoCache with AOP instrumentation is only needed when using replication, so that you can replicate only updated field values and not whole objects. Am I correct here, or are there any other advantages to using PojoCache over TreeCache in our scenario? And if PojoCache has advantages, do we still need AOP instrumentation and to annotate our entities with @PojoCacheable (yes, we are using JBCache 1.4.1), since we are not using replication?
Regards
Jonas Heineson
PojoCache has the ability through AOP to:
only replicate changed fields and not whole objects. This makes a difference if e.g. your person object contains a huge image of the person and you only change the password
detect changes and thus automatically put them on the list to be replicated.
TreeCache (plain) does not need AOP, but it therefore cannot replicate individual fields or detect what has changed, so you need to trigger replication yourself.
If you don't replicate, those points are probably irrelevant.
IIRC, you don't need the @PojoCacheable annotation for PojoCache. Without it, you need to specify the classes to be enhanced in a different way.
I have the feeling that if you are not replicating, the plain TreeCache will be enough.

NHibernate cache expiration

I currently use a custom-developed ORM and am planning to move to NHibernate.
Currently, I use both L1 (session-level) caching and L2 (application-level) caching.
Whenever the L1 cache requests an object from the L2 cache, the L2 cache checks the database to see whether the object has been modified since it was last loaded, and reloads it only if it has.
Can I do this with NHibernate? In short, caching does not hurt me, as it always gets the most recent data and saves me object creation and load times.
IMHO it's pointless to have an L2 cache if it needs to hit the DB anyway. That's precisely the entire point of caching, avoid hitting the DB as much as possible.
AFAIK there is no caching strategy implemented like the one you describe, but NHibernate L2 caches are entirely pluggable so you could implement it. However, I wouldn't, for the reasons I mentioned above.
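For reference, the strategy from the question boils down to something like the sketch below. It is not an NHibernate cache provider, just a minimal Python illustration with a hypothetical in-memory Database and ValidatingCache, and it counts the round trips so the trade-off is visible.

```python
class Database:
    """In-memory stand-in for a table that keeps a modification counter per row."""

    def __init__(self):
        self.rows = {}  # key -> (version, value)
        self.version_checks = 0
        self.full_loads = 0

    def write(self, key, value):
        version = self.rows.get(key, (0, None))[0] + 1
        self.rows[key] = (version, value)

    def current_version(self, key):
        # Cheap query, but still a round trip to the database.
        self.version_checks += 1
        return self.rows[key][0]

    def load(self, key):
        # Expensive query: fetch the row and build the object.
        self.full_loads += 1
        return self.rows[key]


class ValidatingCache:
    """Returns a cached value only after confirming it is still current."""

    def __init__(self, db):
        self.db = db
        self.entries = {}  # key -> (version, value)

    def get(self, key):
        cached = self.entries.get(key)
        if cached is not None and cached[0] == self.db.current_version(key):
            return cached[1]
        version, value = self.db.load(key)
        self.entries[key] = (version, value)
        return value


db = Database()
db.write("user:1", "alice")
cache = ValidatingCache(db)

first = cache.get("user:1")   # miss: full load
second = cache.get("user:1")  # hit: version check only
db.write("user:1", "bob")     # changed behind the cache's back
third = cache.get("user:1")   # version check fails, full load again
```

The data is never stale, but every cache hit still costs a version check against the database. It saves the full load and the object construction, not the round trip.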
Getting outdated data is only an issue if there are other apps or other DALs hitting the same DB besides NHibernate. If that's the case, you could use the SysCache2 implementation, which internally uses SqlCacheDependencies to invalidate cache regions when data in the underlying table changes.
If it's a single app running in a farm, use the Velocity provider.
If there's only one NHibernate app instance hitting the DB, any cache strategy will do and you don't have to worry about getting outdated data.
See also:
NHibernate docs about 2nd-level caching
NHibernate 2nd Level Cache # NHibernate Forge
First and Second Level caching in NHibernate # The NHibernate FAQ
Ayende's posts about NHibernate caching
The built-in level 1 cache in NHibernate is not very sophisticated, as it is stand-alone and in-process by nature. So you need a second-level cache in order to improve the performance of an NHibernate app, since it reduces time-consuming trips to the database. There are many third-party integrations available that plug in as the NHibernate second-level cache. NCache is one example, and it requires no code change. Read more here:
http://www.alachisoft.com/ncache/nhibernate-l2cache-index.html